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L/^^n^^^U:, #HK«S*tfcB-VGP^ 
4 ^n^o 7 ^420K*frLT, ES*- KO^ffll^Ho 
07>f-;VK81^^ Ml/ (MVf.top, MVf.bot, MV 
B.top, MVb.bot) £ttff-f££ fcK J: 0 &£*i& 0 E9o 
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M-fifclzmmLtz, 74 -;vK=f- FA:* 

*, it**- MMBr** h**rtm-r*;frffi-C*oT f 

h yftoXXftf h ^7 4 -;i/KSr*"r&»*4>7 4 

-viW^-v^ h y 77 ^ K<97 * 7 — K 

#h A7 4 -;V KO-o£*igi: L, **<0>f^-v<a 
^ f A 7 ^ - ;V K O 7 t 7 - K ^ ft ^ ^ > ;V , MVbot** 

*i* f U^OIt^:, **<0>f>-yO, 7 

££1/^7^7- MM**** MU £ 

[if** 2] «*«li:EltO*St*ot, «tt*>-f 
*-V<D h *y 7'7 4-;i'K&^«iJ1-&£#><07*7- K 
^ft'** h;V, MVf,topii*, (MVtop *TRB,to P ) /TR 
D.top+MVDt' L^^oT^r^^fL, iiT, TRb, topiiS. 

t £*i£J§*<£>^ *-V<D7 4 -;v K4:OHI<0«fHW* 
WMfcWl&U TRd, top*i**0^ > - h^^-f 

[3!**3] l*#*2KE«*>:&8rC*o-C, MVf.top 
ftffllt^^ tLMVtopis «fc tfMVbot tt »Sfc ^ 1/2 )V ^ 
[ft** 4] W**2 ifcl*3 t:!£«Ojftt^o 

-c, mtppi3j:*mto,tqpi*, 'iWEatto7 4 k=3 

«*D*c# h A7-f KT**tf*if3 tf**K93i"*B#ia 

[it** 5] . 11**1 4v>i4 ©h-t^ke**)* 

tS/:ft07*7-K»»^ MVf.bottt, (M 

Vbot*TRB.to P ) /TRd, bot+MVD tCL^^ot, 

t, MVbot^J: 5»tSft4i*0^>-^7-f- 
;v Kt <^|«O^MW4Hffi^JSL f TRD.bot*i**0 
^-y^*fA7^-^Ki:, MVbotKJ: 
*i&«*0>f >-y07 4 K £ OH^HMft MM 
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[ft** 6] «*a5CE*0*tt'C*oT i MV 
f.toptt, HWJsfa^to b 9^- > a feoSIScO 
iM»*:«ffltrft3e*^ MVto P i5J:y f MVbotiiS»01/ 

[11**7] W**5 * fc«4 6 ^E*0**fe"C*o 
T, TTlB.botiJiOTRD.tbottt, Titi£5.£ 7 4 ->UK 
MfcLfc>f h ^7*7 4 -;vKTifc£ 

[ft** 8 ] It** 1 L 7 OV^rtt^Uffi«<0* 
SfioT, 5jE<O^-y<0f y 7 p 7 4-;VKSr : ?iB!l 
-t-Ztz)b<0'<v 9 7- K»tt^^ h MVB.top^S: 

(a) MV B ,to P = ( (TRb, top-TRD, top) *MVto P ) /T 
RD.top, i5J:tf (b) MV B ,top=MVf,to P -MVtop<7)— L 
tzifi^X, ft5eS*t, TR B> to P t4S,ft<7)^^- 
v(Oh 9^7-f-^Ki:, MVtop^i: 0**^ Zit2>& 
*0O-y<07 4 ->u KJ:OWOWp«M»HIHii-itl£ 
U TRD.top*i**0^ ^ -vO h y 77 4 Kfc, M 
20 Vtoptci: $tt^®*0>f ^^7^^ Vt 

b ? 7"7 4 KSr^SOi-^^^07 * 7- V&W) 

[IS**9) »*a8{:E«o*S-c*or f rnffBsS: 
(a) *i, TVl/^^lb^^ MV D =0^i: £ Haii£R£ 
*t, mlffi^: -(b) (4, MVD#0Oi*a«StL& t ii^ 

[IS** 1 0 ] if** 1 L 9 Ov*-f*t**wE«0 
JStioT, &&<D>( *-¥<OtfY ^ 74 -;l/KSr^ 
30 iBJi-^^AO^^ * 7- K#f&^* h ^, MVb.hottt, 
(a) MVb.bot- ( (TOB^bot-TRD.bot) *MVbot) /T 
RD.bot, isXZf (b) MVb.bot-Wf.bot-MVbotO— 
tztf<?T, TRB.botlia<EO>f ^ - 

A7 4-;l/Ki:, MVbotUJ: *}36*i$*i*a 
*<7)^ ^ - y07 4 K£OM<a^Mtt*nHK*tiE 

Vbott^i: > - y^7 4 -)V Vt 

<o m <d w n w * sa hs ic jttE l , MVf . bot a a« <^ -f * - 

v<05K h A 7 4 KSr^aSi-^^^0 7 *7 - K«» 

ffi^C (a) ti, b ^, MV D =0(Di: ^(CJIIR 

S*L f WEA (b) (4, MVD^OOi: £M&i2tiZ>, 

ltt**l 2] — ;v-1£t*<4 > - vt^ioi/^ 
r, h77 p ^iO f jKh^^7>f-;vKS:tt4, aft 
<D, ^mZtitz, 74-;VKn- Kftv^oyn^o 

Sijfv^n^n^ (7 * 7- Ka- Kft*- Kf-*K& 
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3-7 -<D7 *7~ K^lt, SADforward.fieldS-^-t^) 
lit, ^|*7^rn7^ (A7^7-K3- 

h mnmfrX- 7~<D7*7~ K-fett, SADbacward 

,fieid£&£1-*Xgt, ^^i^ii^oSm-r^n 
:/ny* K>fb*~ KK*tj&-T£) KHL, S 

ft, SADaverage. field* IS t, ItfSSADOg 

/M:L/:^ttS3- Mb*- K^Mt^Igt, 
im*mi 4] W**i 2Jfciii3 KIS3£o2f£-e 

£>oT, SADforward,fieldti, (a) ii*0^0^>f 

ft, &J:tf (b) jli*0«i«cD^^^Dyny ^<o^h 
A 7 >f — ;v KKW L, g4<Ov^n7D7^co^h^7 

1 5 ] rn^rn 12*^1113 

ifcoT, SADbackward,fieldii, (a) **<D^i^O^>f 

7Vy ?<DY y-/7 4 -)V Viztt? &mttft%Mfr<0'& 
ft, &£Xf (b) **0^rS<7)-7-f ^n^-n^^o^h 

1 6 ] ft** 1 2i^L 1 5 OV*r*i*fcB 
WJStJboT, SADaverage,field(i, (a) 
J: tf**<0*it5<7>^>f * u :/n y ? <D Y y 77 J K 

Ktr^-r^*6WW^M^O^ft, (b) 

f^tcril, I^v^u7n7^>fF^7^-;vK 

OT*v*;Hf 7**^ ^-v^^lt^, F ? 7&£Zf# F 
A<07^-;vK**t^, aft Ig^-K^7>f- 
K=j- K-fbLfc^^ n/n ? ^T*oT f F ? ^ J: 
CT# F A07 -f -/u K?:tt^aS^7 -f - A> Ka- K 

#-t£*3fcO, 7^-^K3-KftLfc**^D7n 



4 

F^Uk^FA', MVtop, ir^^^ffiflSiaiO F 
X tf# Y a co 7 < - ;v F*<0— o * t f & * 
nyn -7 ^O** hA7>f-^K07*7- K^16^<^ F 
>K MVbot«:ffiiat"4*S^t **07>orn^ 

y 7"i3 J: VsK h^07^;i/ KO^Sr < i: t — oSr^flJ 
-r&fc», 7*7- KiJJ: 0^**7- KSMBr^ F^ 

^ t>CC^ (MV t op*TRB. top) /TRd. top+MVDtc L A s o r 

^, 7*7-K#tt^^h;V, MV£ f tcp*r*5ei"&#« 
4--^^, MVB.top^aft^^^n-^n^^cOb 
7/7^-;VKi, TRtop^i »)**t$tL*a*^>>f 

d. to P ^**<7) ^ > - V<T> \ y 7 7 << - )V K 4: , MVtopi- 

20 ^IBOWMttiMBCWlSL, MVDii^^^»^^b 

MVf.topii, ^fn<o*1p)^0 f7t-va >t4o.|aS 
0»»Sr«fflLrftje*-*L f MVtopi5J:VMVbotii«E»0 

III** 2 0 ] H3t® 18t/:iil9 KB«M>7*3 - 
TRB,topi3«J: tmD,topti, fFf£Sft07>f 

[IS** 2 1] 7iv>L2 O^v^-ftL^tcffi 

f^fn-mot, S: <MVbot*TRB, top) /TRD,bot+ 
MVdU U:foT, S4^)7^n7'D7^^FA7^ 

YZ^m-TZtzft, 7*7- K«f6^^ MV 

f.bot*ft5£1-&^«***, Cir-C, TRB,botiiSftO 
7^nrn7^0#W7-f-A'Ki, MVbotCi:^* 

B#HM*WH!^*ttSL f TRD,boti***<D-^ n «y 
^(0*hA7^ -A/Ki:, MVbotKi 0.**t«*t4a 

[«*S2 2] ft*«2 lCEHOfn-mo 
T, MVf.topti, Hfno^loJ^CO f a VSrfeo 

i5cOl/2;v-7^;v^Kj^^ h^-c*^, t :^Ofn - 

[ft** 2 3] »*S 2 1 *fcli2 2Cia«OT'=r-^ 
rtot, TRB.bot*5ilTTRD.bot*i, mffS^ft(07^- 
50 K 3 — YitLtz-?? a /a ^ jP*#J h ^ 7"7 >f - 
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Wfn-moT, * (a) MVb,to P = ( (TRb, top-T 

RD.top) *MVto P ) /TRd, top, (b) MVb ( to P =MV 

f.tpp-MVtopO— SKLfctfot, S|E07^yD7 

Rb, top^^O-7 ^ n :/n 7 7 p 7^-;PKt, 

- ;v K k <D raoHWMi IWPH^WJCd L , TRd, to P t**3fc 

(0"7^D7'O7^Oh7'/7>f-;VKt, MVtop^ct^ 

{111*9 2 5] i*S2 4.|:EWf3-^T*ot, 
Zb^rfrt&W}'*? h;vMVD-0Ok ^tJffiA <a) SrS 
MV D =#=0Oi§«ffBS: (b) 

fn-mot, (a) MV b ,bot= ( (TRB.bot-T 
RD.bot) *MVbot) /TRd. hot, &±Xf (b) MV b ,bot=MV 
f.bot-MVbotO— -filZ Lfctf*oT, Sft^^D/n? 

MVb,bot^ft^i"^>#S*-&^, ---e, t 

RB^botti^ftO^^ D^D y ?<Dt£ h A 7 >f Kt , 

MVbotKJ: ^ii^S4^-7^n^D ^07>f 

Kt^|«o^HW*WHi^«-lEL, TRD.botli** 
^v^D7 , n7^^^hA7^-;VKt, MVbottwctO 
*i£t d:/d ? ?<D7<4 KtoH 

O^MftftBIIl^WlBU, MVf ( bottta«0^*nyn 

r, $bizTfl'?»Wi'<9 h*, mvd-o©£Skiwib* 

(a) iJitfllVD^OOt *ttE* (b) 

[§fclKOi£ffl*M93] 
[0 0 0 1 ] 

tfftf/vx^ (B-BOP) ^iV^r^ 

;vtf7*:fr>f (#tc, B-V0Pj3J:lf/t £l*B-V0P£ 

3- Mfci"&fc*tcttffl$*L**i!l-f v#>f 

m&zTfmw.izm-i'Ze 

10 0 0 2] 

b LT«^iitf5:KIS0/IEC/JTCl/SC29/WGll N1796F*jO 
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MPEG-4 Video Verification Model Version 8.0" 
1997^7 J3) KfE«£*lTV*&, MP 
EG Verification Model (VM) 8. 0S*a*&(MPEG-4 VM 
8. 0) t EJJfcttiP* £ o l0^2»*a»iiMPEG-4lS(*a* 

ifSUSO/IEC 13818-2M£* Information Technology- G 
eneric Coding of Moving Pictures and Associated Au 
dio. Recommendation H. 262" 1994^3 J§25 B KffittSft 

10 [0 0 0 3] MPEG-4ii, l£&&7 ifctflC 

l£i£H&#fgt£:£3i1-£o MPEG-40fMlfc&7 V-A7- 

fcSftiT*- 9 tiMMftX, 45 J: if 

20 [0 0 0 4] MPEG-4ii^;l/f->T-V TSR*t?^ tTf'^r' 
*Lfc4»*t x.*o MPEG-4<i3&*«&EE*S, 

t^yxn-x^-n^jf^ smwswiniasi^ 

[0 0 0 5] MPEG-4 t'rt VM 3 - y/f.a - ^(code 
CTMCJ: (ff^tSWo t^yxnW7^77 

30 ^^yt rt^-*#m^lt(CAE) r;v 

^i;XASfc*±^jES*LfcDCT3-^ (k *>KB#PiStfJ^ 
,3.-^^774 y f-V&btl&Ts-f? << h (sprites) & 

-5 *m<Dwwtt&#ttzm$<oi&mtztt utffl^tL 

[0 0 0 6] «»««?ti/:fnfir-^3- Yittf 

40 &W)Mfc& xzfnmo&EMc) tz h xf\zmx?t{2-v) &m$c 

[0 0 0 7] 

[||^A f »*LJ: -9 t-r&««] L^L, B-VOSrtO-7 
^n^ny^ (MB) ff*tl£ £4 > 9 - 3 
so - Mfc#ii f ^o/Sfeii-f U-^$*t, 3-K 
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[0 0 0 8] «f£, B-VQPrtOMBK*tL, M> 
(MV) ^«t*jL&»*fl9ft««* i aS*t't:v^4o *fc 
B-V0Pi*3O7>f -;v Kn- K4tiffi 3 S:Itt«WK*~ K3- 

ir^< 9 7 4 -/U Ka- hMbB-V0Prt<OMBU*tLT, n 

[0 0 0 9] 

K^Lfctf^**:/v^* (VOP) K43tf£, 

X-im&VJ > - via J: Zf/*tz\tm&<D4 * - v * 
- Kfti-* fc*K«Jfl L/:S*-f > - vt!>M > * - V- 

[0 0 10l*WO»-OBatU t -lOfy^V 

jVKHov^D'/d^ (MB) cOJ:^^, ^ftO, 51 
^InJtc^ffiOL^, 7^;vKn- Kft^>-vWl 
r, fig**- K^Kj^ Mi" (mv) S:tttti-*;£ffi-e 
£>& 0 h y/iJitftfh &<T>7 4 KS: &ofi*0 7 

Xf# Y A CD 7 4 -;V KSr 0**0 7 >f K^- Kit 
L/c«t$l^^^n^o **0>f>-vl4 f MVtop, i" 

^^*>**o>f *-V<r> Yv?7 4 -n* K07*v - K 
MV**tJE«*0*f >-y^)F^7>f-^ K*»# U7 

-y^««Ltfi$Wo I*t Lfc7-f KI4, 
&&<0 4 *-¥<D Y V?7 4 -frY'K&^X, MBKfcfL 

ftaoa^MB^-g-tfo 

[001 1 ] £ OMVii, * *l**ift*0>f > - v (fc t x. 
If, ^Ttt-K**?- K) «r«*i:t4^ *<o^«ij 

**-Cli7*7- K) "forward (7* 7- 

[0 0 12] MVbot, t*t)^*^^f>-y 

<D#Y A7>f -;i/K07*7 - K»Mr** 

^ ^ - v?C0 h y -7*7 4 -;V K*# h K*»OV* 
i*ft***r*i*fci"&o 7*7- KfcJitf'***?- KMV 
14, *^0-y0, ^f^7^;vK^)7i7- 

h * 7 P 43 4 Xf/ i tz 1 4 h A 7 -f - ^ K & ^ SO "T & tz ft 

[0 0 13] MVf.top, -t*t>*>m&<0<i >-v 

<oyv*?7 4 vi^m-f&tztxny * 7- K»»^ 
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^ WW45ft, MVf pt op= (MVtop*TRB.top) /TRd. top+MVD 

"r&f*^^»16^<^ Mve**), TRB.top*4S44>-f > 
-v<0 h 7*7 >f~AK£, MV top K4 

j£L, TRD.top^i**0^>-v^^^7 P 7^-;^K 
t, MVtopK4 l)^t?tL-&a*<0-f >-v<07>f- 

t4, >f^-yi s $/T?^71/-^U- fuBaa#»t-fe 

[0014] mmz 9 MVf.bot, i-&fo*>«tt<0>f >* - 

yWh A7>f - A, Yi^mir htzft<r>7 k^Kj 

b;H4, 5fc, MVf,bot= (MVbot*TRB, top) /TRD,bot+ 

^-yo#hA7-f-A'Ki: l MVbottci *)&m.£Zi% 
tmL, TRD.botti**^ 4 > - M7>f-;VK 

[0015] MVB.top, 1-^t>^P.ffi<7)MB<7) h77 P 7^ 
-;v KftTSli"**:*^^^ ^7- K^»^^ h ^*4 
MVb.top- ( (TRb. top-TRD, top) *MVto P ) /TRd, top 

(7*;i/^»tt^^ h fr, MVd=0O^^) , f /cUMVb.top 
-MVf,top-MVto P (MVd^OO^: #) IzLtzifivXVkfcZti 

[o o i 6] Mv b ,bot, lrt£t>hW&<rm<o#Y A7^ 

-^K*^«li-**:*0/<y ^7- K^»^<^ h;n4 
^t, MVb,bot= ( (TRb. bo t -TRd, bo t) *MVbot) /TRd, bo t 
30 (-rA-^ffij^ h ^, MVD=0Oi:&) , t ^(iMVb.bot 
-MVf rb ot-MVbot (MVD^OcOt IZLtz&oX, 

[0 0 1 7] fg?BOtec7)g|«£: Lt, -!Of?^/Hf 
>-vtCi3V^T, h y7 r 45 4Xl f #VA0 7-f-;i/ 
K«:iog4(7), ^S0L/c7>f Kn- HtMBi^^L 

-Ki4, ^y^7-K*-K ««IIB(4^M 
fitjt:, S^Jiim-e^^MBcoma:^^) , 7*7-K* 
— K (Ci^«4, *mMB «4K*tOMB<0«ri:**) , * 
40 fc(4¥*& (/c^x.lf, ^*^) <^r-CI4, 5feO**MB 
i3j:U f «<«mMBO¥^^ffl^ti^>) ^**o 
[0 0 18] *3jftti, J§*0*mMB (7*7-Ka- 

&fr^7~-(D7 ^ 7 - K^ft, SADforward,field4-*S 
tSIgtltfo SADforward.fieldti, S*0*mMBV^ 

43 & ft® ^ ® -&MB t aft <vmt <o m <nm^ ^zi-y* 

fi:«»tii7-.t:*to **0*i»MB * 7 - K 
a- Mb*- KicJtiDi-*) i:-BIU aftoiBKit-ta 

ftJtlfl-i 7 - <0 7 * 7 - K-g-tt, SM)bacward, f ield *> 
50 4fcjfc5e*il&o SADbackward.fieldti, **^**MB 
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(C i5 tt -S. ^ t mft OMB CO O 5 ^- > 

* -(t C is tt -5 * 9 - £ 3*1" o 
[0 0 19] **J3itfii*(OSiiMB 

y-<D¥%) / &gt, SADaverage. field <b t tl&feZfl&o 
SADaverage.fieldti, <t tf** CO^i^MBCOSii^ 

ig-g-MB WW <OMB t ^fiEGOMB t <0 H CO M3t ^ ^ 7" > * it 
Uijlt* J- 9- £tf -to 

[0 0 2 0] 3- Kffc*- KliSADO«/McL3t**o-C 
StR£ft£o -f-ft-TftcOa- K-fk*- K<O;&IlfcMVC0& 

[0 0 2 1] SADfonrard. field, SADbacward, field, ■*= 
i IfSADaverage, f ieldfi , h V 7°i3 ± ^ ** h A 00 7 - 

;v K ic to tz o X % oj&frfB m * *tt"T * - t K i •) &5e 
[0 0 2 2] 

**yyi^ (B-B0P) leisure? D7D 

(mb) oi^ifv^jvef^-f^-v miz, MB 

fi7 -f -;V K3- hMbLfc^jEcOMB (7 1 — A3-K 
-ftMB, ^^{/IZ^S^r^f^^^^^PMV^-g-tf) CO IS 

(PMV) **tt**«-t*o mm**- K*S 

«?-r * » tt k*4WH n <o*o k sn x. , 74-*v 

3- K-fkMB»c*t1-*iS»3- K-fk*- Kt-^x. 
[0 0 2 3] HI Iff* • *7-$>i* h 

• yv-v (V0P) Hk* i^-f-fkoyn^^Sr 
la^-T^o 7 U-A105A*Hocoil^Jiu> > h (IE^ 
^m*^U>>M07, «^RtWf*iW> > M08, *3<fc 
l/llj^C0*tl^U> V H09?r-g-tf) Sr-g-Oo 7V-A] 
J-U>VHi, V0Pn7**jE*J^fr*^1^> 
> H074-*L, V0P118* s ftfin^P^ > M08fc*L, 
V0P]]9A s lilJgcO«tl^U> > VIM* ft? X *) \Z*.>T* 

£„ V0PJitt^O®tft«r*L, V0P<O^M#tfT'^*-7*v 

ttzvopx-bh t^TLhtiho Ltzrf^>r, "vop" com 
Riiii-cJittAoau^tt* (WxfcWEJB) <of >- 

■7^^**gEfilC0S£ffiSrffifflLT»t>ix, ITU-R601#jRr* 

-f<?>i><ot,zmz7*--?v Y*m-?z>o zmmw-r 

[0 0 2 4] 7 1^-A105i3 J:?/7 U-i.ll5^^COV0P 

r-?nmm<Dft^t&mz®i&2n.i>o #k. voph 

7, 118*3^^119^, i> 3-^137, 138*3^^^139(0 
iSV>T*il-e*i, $lb £ <fc 7 * 7. 7" ^ -^-fk 
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*£tt&<> SJ#3- Mkii: ttc, :M) -i3i U f :J'• 
^-*7^^^#re#fta ^ #-SHk£i^*o 9Wi3-V<tt 
t1>lZ, ®#1fffil;t7l'-ArtO&I!i&££'(£fl§LT3 
- hMk£:H-&° x^X^^-n - K-ffcii: fcK, DCTC0 
-t o 1t&m9&tW, fz&Kzi- MkSrvfcpT^S 

[0 0 2 5] 3- Vit2tifcV0?7-?lt<k\ l Z, fr* 
)\s\Mzt>tzi>te$k<otztb\,z, -7;ufyu^t (mux) h 

10 TE«*ivC*> -kvv §41 Lfc3- K-fkVOPr'- ^iir 
v^f-^l/^'t (DEMUX) lei l?«-«t?n, S-filL^VOP 
117-118ti«-f-'fk?^, Him^n^e 7U-A155, 165 

isii/nsji, V0PH7, nsisicfiig^-eti^tL, W.^r 
4t$tL, m&Zti, LtzifioX, tztz-\f\£7'*7 4 7 
? ') —170^ -Y > ^ — 7 i ^tSitS^ Z/tfV v 9 1 

[0 0 2 6] 3^-/7^i, a.-^f-C0i^lcES* 

7-f75'J-170»±, SMLfcVOPill**, illFlc^ff 
L/CV0P178 (^Alf, R?^) &$tfZ\t&X'%i>o 
•y- (i, R)^coV0P178SrE^?gcOV0P117lcfi # ^ 7 
U-A]85*l*it4r.i:* s Tl^o Ltzifi-oX, 7 
U-A185J4, Sffi4^/:V0P118, 119t, JfgB«nc«# L 
fcV0P178t Sr#tr 0 
[0 0 2 7] flfico^II h. LTti, WM<OV0P109«r, 
-<OS«?L7^^«lcM#^x.4i-t^T-§*c fztk. 
If, xUkrcO-^-^S5c^^^,*t §IC 7f)>t- 
30 fc, v^-coi d ^=^*:^ ^*J5I LfcVOPco 

iiC3 - K-fkLTtiv» 0 3---r-li, 7^77';- 
170^?>, Xli*MB^*lfll#0'f--W*/i'OJ:9*tt 
c07We#a^*bWS*S*?Lt#2»o Lfc^ot, -x- 

[0 0 2 8] e*7'*9'f 79 ') -170ii, f-+>^;W45 
^^LTS:^$ix4V0Pfc«*-r4wi#-C§, f/c, 

>f>^-^-^ hcoid'fc^^ w -fifth rvop-^flfe 

(fft-b7V3>li, 00V0PX*iV0P<O v - 'r > x * 

[0 0 2 9] HI 60tf7'**7*v'i^ K03- K'fkZttF 

A , (RffJS^, ^9 7 ^ * ;V • • r^-7i 

t'T-'^-^a, 0?-*'^'7yjt->a-/ 

3-K4k$tLfc («x.tf, 7-f-^K-t-K) V0 
P«r toME/MCCOig^Jl, i ^tl*Sr4-x.4o 
[0 0 3 0] -02 ti, Ltzifivtz3->3-y<D 
so 7D^if*i„ x>3-^li, ^aijn- K>fkV0P (P 






&m¥- 1 1-7*191 
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-vop) Rir$Ljj$)^- Mbvop <b-vop) (omxto&m 

[0 0 3 1 ] P-VOPii, -f > h77l/-A*- KXI^ 

ft^8^7?n?D^ (MB) *-&*»& 0 
7U-A (INTRA) (03- Mkt i: tic, v^n^ny 
* (MB) li f 3-MkS*t*o 
-f>*-71/-A (INTER) 3- Kffck i MBli, 

7 (/:titf, VOP) Ji, P-VOP (B-VOPTttfc 

^) -Ciitiitfftfeftv^ I-VOPrt^SLTn- Mb$ 

10 0 3 2] 7*7- K^fflfct <>IC, aft^MBtt, T 

^^*^i-^o (MV) **, 

?-VO?<Dtztb<DT h^SII^- K**ttJB$*i, ^ 

w *C, ^tbffiftii, 16X160MB-C(i^ < , 8X8oyny 

7I/-A03- MkL?tP-V0POMB^M^^, 7 1/-^ 

[0 0 3 3] B-VOPt*, *?9 r ?- K*«. SUfrlRff 
SO, isJtffiSHe- K (iitfe^r-f >^-7V-AS 

SrfctfK, P-VOPKM&LTJL&LfcJ:?* 
7* 7- K^aO^e- KfcttfflT^o B-VOPli, MPEG-4 
MV8.0 (DTT, ^>h771/-An-Kftl/:MBS:^li 

A (fciitf, VOP) (i f P-VOP (B-VOPTMi&v*) 

[0 0 3 4] B-VOP<0^ 7^7- K^SO k t t K, S«£ 
OMBtt, B#HKJ^MtCab^)7 >*-7 AcDMBo^- 
^ffij&kifciRsa, s^o®^^*^i-^o 

ftjMVri* (7*7 — KMVk LX4&htlX\'*Z>) , gi&OJg 
^OIffi^^*ffi«<OMBOffii*a£ffi*Si*o B-VQ?<D%L 
JfWIiiti:, 3i£OMBti, WfIH«jU1»U**T > 
*-7 U-ARtf#Wltefctt< T>*-7 l/-A<OMJ 
OMBO^-f-^StJtteStl, flOM«:ftStSo 
7*7- KJlf;^?? - K»»MVri*, S£<9S<£OMB 

[0 0 3 5] ftK«<P-VOP&wE^JS*ifc^* n^n y 
^**8x80r K/<>* h^Sa^^ KSrffiffli"^ t, B-VO 
POltg^t- K^aSk t i> 8X807D y ?<Dtz*b<D 

Itffit/^Ot-f ^^St^f B-VOPO^ny* 
(DtztXD&W)^? h^*rS<fc*tl f P-VOPOSxSOy 



12 

[0 0 3 6] -«Wic^^iirv^ -xvn- 

ttffi230£, r^^^^-3-^240iSr^ f 
Kl#i£«£t£220, ^li«{Smtg230, 

240XU P »«=J- ^210(4 Sfc, MPEG-4<7V*7>-** V0 
P_of_arbi trary_shape"£) J: ? fcVOPJKtMSfRA* *> * - 
^ ^-;V207-C^i-^o ^A^-^HDtWi: 
10 V<Pli«#»^«tt**L, IWot, JB#=J- 

^210i4'l$JB£*l&V* o 

[0037] mmm * tutrix - vop«tg25o**, &n 

Jije«IB220*V»»*l««l6230*C iot*fflt4fc6 
Of5««;Six/cT>*-V0P«:-^ic*o «tt<OV0P**, 7" 
^^^^-n-^240Ti>3- fMfc#*tfcW» (resid 
ue) *#;l&fc«>K, ^7 r h7^^260^^t&?i{@:^tL 

4014, DCTKJ:^ -7;l'f-7 p W^"9- (MUX) 280^7^* 

man, ®&%m .mH^o 

20 3-^2401*3*:, ?5*JS'S*tfclWOT>*-VaP 

«S6250— XJj-fZ tz#><D f jDffS270-C^tt«ftfl230 

[0 0 3 8] #I61f$& <«x.tf f (4, ^ 

tt«5e«Sg220^^MUX280^i:4*X-^*t, V0P^^«$r* 
•fteftffi&li, J^3- K^t«tt210^fellUX280^#x. 

«bn^o mux280(4, Hfc-tz&mitLtzr-* x f y- 

7 7 7 290^-5- x_£o 
[0 0 3 9] x>3-^AW^Iff-^li, YU 
30 V4!2:07*-V7 h^^i-^o VOPii, 

[0 0 4 0] U3(i, *»«Ot-fOfcAOimiS 
«r^o ^»it^S^ltj«{S (ME/MC) li t -fete, 

a4otfft7i/-A(oyn^ (^jAif, aftco^n 

^^) iSiS7l/-AW-fW:i^7'n7^ <« 
40 ^.t^, ^a«L/c7'n^^Xi4^*7 r n^^) £ 

^n7^iiico7U-Arti:*-So R^rrtl^fl <B) n 

iH^n ^^ogEttti, i&m^tYfr (MV) ^ti 
«i, (x) X^SK <y) J*»**i-*o MVj£#oiE 
Offiti, ^aiLfcyny^#aft<0^n-7^O2&*|pl3i 

[0 0 4 1 ] »««L«»rcy T«Lfc^ 



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HZ>ZtizX^>X, ML/:S«0/o^^SiSt 
>&ut«l^o ME/MCO^n ^ ^(i, 16X1607U- 
i±-?uy? (7^nyn^) X«*8X8<E>7 A 7n 
7 ^3U*16X 8 co ^ — ;v K^u y ?x$> »m&o 
[0 0 4 2] MVOffiKli, l/2lffi#KK:5££*i£o HMD 
ft**, 7>*-7U-ATH£ffl£ft&ltfttf&^-f, P 
(i+x, j+y) ti, xXlfy^JtLTjettSii. 

ott«tt, a, b, c*ot>tsi-j: +-r*$*i 

£o l/2H*<Dl£«li, a, b, cRtfdT^-T J: d fcR-C 
^*to UTOfciS 0, a=A, MA+B)//2, c=(A+C)//2, 
aUT)=(A+B+-C+D)//4-C* I} , -d-T, V/ w l*A»fc*i* 

m*)%i7fi-to nm<ottm&, mmwEG-4 vms.o, & 
h¥w<D&nmjs.Rtmi* n tm-rz, - 

®^08/897, 847 (1997^7^ 21 B tUH) (£ OZStSfcli^ 
[0 0 4 3] 0612, *«!»KtEort:, S$7U-A/ 

ffiif, 16X\6<Dm^<D^&<D-*? nyxj y ? (MB) tf, 

£ G MBOH^v^ ~0016X8<97 >r - 

;v k^d 7 ^ ofW)— (7)7 -f K7^f wc^ov-y^it 

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glJ*llME/MCn-- YitZtih* 
[0 0 4 4] 7 4 y<D16X16W? D^n y ^ (M 
B) rt*, ##600^2 *tT^& 0 MBii, «3£#S<07>f 
V602, 604, 606, 608, 610, 612, 614, 616£, 
§07-0603, 605, 607, 609, 611, 613, 615, 617 1 

2) <D7 4 - frV**ti^tiJ&l8Lir2>o 
[0 0 4 5] >f >-v600OB*7-f — "7>r 

fc, ?¥-§-650-C^^tI>5>*^^ y ?tfJ&$L$1XZ>o 

##645^S*l££c£Pl*, 7^ >602-617*>W8<a3fcU f 
#x.£7F-To ffllictf, MB600^«l#SO7>f 
iStfc*107-f >602Ji, MB650O«l#SO9>f >Tt 
*>&o «»fl^9>f >604ii f MB650O^2#SO7^ 

O7-f>606, 608, 610, 612, 614, 616te, Zti^tt, 
IB650O*3#i*5>*S8*eO9-f J: 7 fcH 

x8<O;l/^^>^^[it680^fi£$tL^o fflMtfc, 
S07-f >603, 605, 607, 609, 611, 613, 615, 617 
l±, 16X80«*685*JK*i-&o 

[0 0 4 6] P-VOPOfcfcOMC*- K S: X^-T tf> 
*^7 p nrtr^(i, tlTOH^t*^ 7l/~A.eft 
Ki:WL, flMaJK, oi6xi6oyn ? ^ (09* 
if, SADi6<MV x ,MVy)) X^ffl^8x80ray^ (ttx. 
If, SAD8<MVzi,MV y i), SAD8<MV*2,MVy2), SADs<MV X 3,MV 
y3), StfSAD8(MV X 4,MV y4 )) Ofc*Olfi*t£4WD (SA 
D) £«*o TEO*10»* f 8X80¥fll*SttL. 
l6xi60^SiJ^«^f &o 

[0 0 4 7] 
i&l] 



If 



I-' 



SAD 



40 



* [0 0 4 9] ^W*- Kft5£0£»«i f HTOS-2<Of 

(b) #fk/hr 8x80^ib^<t (7M>^ 
b^SJ^-K) **«JBS*t&e ft2o (c) 

■liNb/4+l*-fc»5>:h.*o 



[0 0 4 8] V-^ LfckTr^CfcfrLT, 

top(MV x _top,MVy_top ) , B.^ SADbot torn (MVx^bot torn, 
MVy_bottoa)£*#&o - - {MVx_top,MVy.top)S^(M 
V x _bottoiD, MV y _bottom)ii, Yv? (Mft) XWf 

(bottom) 7 >f YcDtzZXD&W*'*? Y )VX-&2> 
'^\z, 7 4 Kcoi/2-9->-/^^-^^^*/hOSAD 

(ffilictf, SADtopRU f SADbott«0/c*0) Sr*-T^>* 
$7>(-^Ki«t^o 

(a) SADJMV x ,MV y ), (b) 23^r^.^ + <» < 

and (c) /MJVA^wJW,_*J+M^ ' 

[0 0 5 0] SX&tD^Mi^'MIR^titzm^, OBOC08 X 8 50 t7);u = y ? ir09o(OMV***)* (i-**>*>, 8 



& "65" 
[»2] 



!8S 



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tL 
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*161i^fl^^L^^o Tffi<7)*l «4, 9u$i->XW<D 
fcA01/21i*tt^01/161I*ffiOSE**:^i-o ffili' 
if, 0*b2/16stoKft£b*i, 3/16* 13/1639*1/2^* 
*fc*t f 14/16Xtf 15/16*2/2-11:: A* fc*l*o 
[Table 1]

1/16 pel value   0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
1/2 pel value    0  0  0  1  1  1  1  1  1  1   1   1   1   1   2   2
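The rounding in Table 1 can be sketched as a small lookup. This is a minimal illustration, not the reference implementation: the function name is hypothetical, and the handling of whole pels and of negative values (round the magnitude, then restore the sign) is an assumption that the table itself does not state.

```python
# Fractional part in 1/16 pel units (index) -> value in 1/2 pel units, per Table 1.
HALF_PEL_FROM_SIXTEENTH = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2]


def round_sixteenth_to_half(value: int) -> int:
    """Round a displacement given in 1/16 pel units to 1/2 pel units.

    Assumption (not stated by the table): the magnitude is rounded and the
    sign is restored afterwards, and each whole pel contributes two half pels.
    """
    magnitude = abs(value)
    whole_pels, fraction = divmod(magnitude, 16)
    rounded = 2 * whole_pels + HALF_PEL_FROM_SIXTEENTH[fraction]
    return rounded if value >= 0 else -rounded
```

For instance, 2/16 maps to 0, 3/16 through 13/16 map to 1/2, and 14/16 maps to 2/2, matching the table row by row.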



10 0 5 1 ] 7>f-;HWIi f 

(Wxtf, 7^>602, 604, 606, 608, 610, 612, 61 
4, 616) 14, 1»JfeSilfc**7-f->PK*«ffl+*hy 

Mttnejoaett*. &&<D&&mzz^xm2ti2>±i 

\Z, 7 U-AOffi«K#5eS*i* 0 l/21®*OSii:^f6] 
<7)*7-fc y h**3F56$*i£ t, B-^i*7^-^Kfi 

>*fc<OW*£*t*»*^*>S*i*o 20 
[0 0 5 2 ]. ~oc7)^ n ^ -fv^T'n ? ^ (DtzZKDlM 
14, #^<OJS^4t— #L, ^KJATOJ:-9HLTA*!>& 

&o 7k¥l£#l4, ^tojMSi*i/2«{o*7-b7 f 

mmzm-to 30 
[0 0 5 3] r Y^>*v^mwfc<on2<omm*, n> 

* t^-/D7 ^OT^^OMCSra^^*, MPEG-4 V 
R8.0* Itf i-f 7 V >^*iw«HBOXKtC»BlSn-C^ 

[0 0 5 4] B-VOPH^i" ^>4f^O n - hMLi£**£ Z. 
■CBiW^tt^o B-V0POJ: 7 WINTER 3 — KftLTVOPK 

■=& — K, 45J:lF7 * 7- K*- K**£ c «#<0 = ^I4 

B-VOPtf^fiSLfc^n y ^I4&^e- Kfc#LTV» 
^v^KgfcjgStt&o ^ fb^, B-VOPO^n v XV 
T>*-7uy ?tf-?u7Vv v7 (*:ix.lf7U- 
A) -n-KlfcSti, *fcl4>f (fc£:x 

[0 0 5 5] — o<ob-vop*, K-c^aosn 

fc#*fc*llB*r<>o£i*T&* 0 "B-V0P" kv^lff 
*4, R^lRltc^iBIL^yn ^ ^**$*LW*i <hO<^& 
^L, £0£fcl4K#$*l&V*o ttmttiiZ, P-V0PJ3J: 50 



tfl-VOPi t t>ic, ^faK^«0L/cMBt4«/B£*i£ 

[0 0 5 6] KB-V0P MBtZft LT, MV(4M# 

tticn- K<tS*i£ 0 7*7- Kfcitf&^ltG*- K<0 
7*7- KMVK*JU &feU f *C^^^7- KiJiOTl* 
fpj-=e- KOA?>7- KMVKWL, raC9!l<oaftOMB(0 
ItffT^MB M BC^><r <OMV (fe-fcx-tf, 7*7- 

[0 0 5 7] ^C*>f T^MVtrttJBL, ftaSMW**5* 
4: H 9 **-«#*£*&*!*::, ± 

frbTX&ZtfcfcLX, 2£0»*i-*IIV07*7- K 
MVJ4B-V0PO3R*EOMBO7* 7- KMV^WL, 7^1/fV 

y * 7 - KMVt4B-VOP<0S£ OMBO/* y * 7 - KMVtC^ 

&o -t^^t), SftOMBK**LftSSft&MV£7 f V^4 

K{sj£S*L£o T y n-^^i4, S ; tt^MB<7)MV(4PMV^i: 

[0 0 5 8] ««EOMB3&«VOPOS*iCtt*-tJ6»-& f S 
£<7)MBH2}-r £7*1/7^ ^ ^-(44fnt*K5c$tL& 0 
[0 0 5 9] >f • hMfcL^B-VOPtc^ 

*m~tz>o izgoo : ?i!ijMvt4, aifiw*^, m^t>*- 

MBO h 7"7 -;V K7 * 7- Ki3 JlfsK FA7>f 
K7*7-K f %bmz>£<DT>*~m<Oh 
;vK7*7 - Ki3i:^5KbA7>f-;VK7*7 - KS:S 
to m^m^Xlfy ^ 7- KMB, i3j:U f /Sfc(43R« 
<DMB£5 J: tf^-c 7^7- KMB(4 , aft^MB<^ME/MC3 - K 
ffcKttLT«ffl&tifcv\> B-V0P(4INTRA3- hMfc.L/:MB 
B-V0PO^MBi4ME/MCn- Kft$ji&iit:4 
^>o 7*7 — KJ3J: Cf/^ 7^7-K7>* -MB(4P-V0P 
S/c(4I-V0P*^>-C*i}f»T, 7l/-ASfctt7>f 

[0 0 6 0] -O^-U-^L/c, KOB-V0 



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PfcWU m^><O^J^^mW)^^ Yfr (PMV) tfVXT 
[0 0 6 1 1 



b 7*7- K 



b y 77>f-;v K, 



7 * 7 — K 
» 7- K 



PMV^ 4 7" 

0 

1 



[S3] 

-7^ P-/P y ^ 



-7 A. 7 * 7- K 
7 I/- A. ^7- K 

ZjLnjkJL 7*7 - K 



0 



■0. 2 



2, 3 



0. L 2, 3 



[0 0 6 2] tztz.lS, S3l47*7-K^e-K (fci: 
itf, "7>f-;VK, 7*7-K M ) 3R^07 

K ( "0" ) i3 £ Xftf hA7-f-/UK7*9-K 

( "o w ) 8i^h^fo^#tt«-?Wo 

[0 0 6 3] KftK#v*T«/B3ixfcSfc, a* 

OMBO^ft^? b;W4f£2iJHJ£r, SKMB^WLPMV^: 

-t<0afi(i-3fc^y!)O»*> 0 Kfclt&MB^MV(4SttO#|0 

iph-?hZ>o -f\*T4 ?9~\±* fciSS*- KOMBK** 
LTl£fl§£*i&v> 0 JfctfjffiLfcMBKttL, PMV««fikO 

[0 0 6 4] B-V0P<O|ft«*- bMfCto-C 
14, b/Hfc#ttft«3it*v\, ftfcoT, 7*7- 
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* V££lfs<y * 7- KMV14, B#MMK#:<OP-V0P MB<DMV 

T EStf*^ ft * o Z-<or)V? MVIiW * 

[0 0 6 5] «4i4, PMV#5fei3J:tfS#OMB*>r 7^ 
go^fcS£^B-V0P<D#l&^ b;v£n- bMki"*^ 
*H«fflSfL*ii*rJfjfiLT^i-o B-VOPKttL, T 
aPMttr** pmv[ lo^da^x^ft, -fe*n*^ = 

O (fc.tx.tf, pmv{0], pmv[l], pmv[2] £ i tfpmv[3]) 

io KfiJ 0#tf<bft£o :o>f ^*pmv[ 

* FiW:L/:^ot^t^/:^i:, pmv[ M 
^^«r»J6^#& 0 B-VCP«ra- Kft-LfeflkKp PMV^^ 
b;vov>< o^tiiSftOMBO^iS^^ b/i'traCfc** 
*7fc»»r*ft*o Zoi /-l4fflo<OPMV 

ttSft^MBt-BB* L fcMVOQ&Kflcff Lt5i $ ft* 0 
[0 0 6 6] tzttlf, 7*7- K, 7 4 - ;V K^SO L 
fcMV**— 0088^^ b^Sr*i-^o ^wT, pmv[0]t4 
b 7 7 r 7 -f-^K, 7 * 7 - KCtt LPMV^Cab 0 , pmv 

20 [0]ti5Kb A7-f-^K,. 7 * 7- KKJtLPMV^&ao 
/<?*7-K, 7 4 -;v K^fflLfcMBKttL-C, pmvl-2] 
14 b y-f7 4- frY'*v Kiwi*LPMVC**o 
^fS], 7*-JUK^«L*:llBfc»U pravt0]iib^^7 

K7*7- K»w»LrPMVr*9 f pmv{l]li#b 
A74-;VbV*7*7- Kfci* LPMVr-ab 0, pmvt2]l4 
b y 7"7 -i-fr ¥** v 9 7 - Ki:W LPMV^* 0 , pmv 
[3]i4** b A7 ^ hv<? ^7- Kir^f LPMVtT^^o 
7*7- K2fc<4^y*7- KMLfc7l/-At- K 
B-V0FMBK** L, Pi— (DWtfh *) , pmv[0] <Dfrtf7 * 7 

30 -Klc^Lffiffl$tt, pmvt2]rt*^y ^ 7- Ki:WLt« 
ffl^ti^c (fciictf, 51*163) ^SiJL/c7lx-A 

KB-V0PMBK#U ZlocOMV, i"**?*>7*7-KM 
VH«"t SpmvtO] i3 £ tf/< y ^ 7 - KMVKWi- ^>pmv[2] 
*«**o yO»5feLfc "5S«fi-^/c*!)Opmvt ]" ii, - 

[0 0 6 7] 
[Table 4]

Macroblock mode          pmv[] to use    pmv[] to update
Direct                   none            none
Frame, forward           0               0, 1
Frame, backward          2               2, 3
Frame, bidirectional     0, 2            0, 1, 2, 3
Field, forward           0, 1            0, 1
Field, backward          2, 3            2, 3
Field, bidirectional     0, 1, 2, 3      0, 1, 2, 3

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%>Z> t \± ft 7 o L^L, Z<D£ v &fc<DI4 

[0 0 6 9] INTRATn y ^DCiSlo'Jf dct_typeO 

7^^^^li-?n^^ltiS'^T, MPEG-4MV8.0 

uSi^Stirv^i -5 iz^ffztika zo&mt, dctt 

ype^Sft^MBiS ilflffi^n y ? KftLXm i> 
£<7>^pjfg£££c dct_type***4&£, AC^ffll 

[0 0 7 0] 134 li***i:Lfc#or>f >*-U-* 
• MfcL/cB-VOPOhy/^-f »-^KOie»^- K 

**, (1) 16X16 (71/-A) MB, (2) -Oh^MBfTt 
14 (3) 8X8 (5fc?rFffl) Oi 3 fcn- Mb**** £ 
K14, a^Onrn^ (MB) CWLt, -fuy 

[0 0 7 1 ] fflS*- H*«tt, *W«t«^*t Lfc** 
1. Title of Invention

PREDICTION AND CODING OF BI-DIRECTIONALLY PREDICTED VIDEO 
OBJECT PLANES FOR INTERLACED DIGITAL VIDEO 

2. Claims 

1. A method for calculating direct mode
motion vectors for a current bi-directionally
predicted, field coded image having top and bottom
fields, in a sequence of digital video images,
comprising the steps of:

determining a past field coded reference image
having top and bottom fields, and a future field
coded reference image having top and bottom fields;

wherein the future image is predicted using the
past image such that MV_top, a forward motion vector
of the top field of the future image, references one
of the top and bottom fields of the past image, and
MV_bot, a forward motion vector of the bottom field of
the future image, references one of the top and
bottom fields of said past image; and

determining forward and backward motion vectors for
predicting at least one of the top and bottom fields
of the current image by scaling the forward motion
vector of the corresponding field of the future
image.

2. The method of claim 1, wherein: 

MV_f,top, the forward motion vector for predicting
the top field of the current image, is determined
according to the expression
(MV_top * TR_B,top) / TR_D,top + MV_D;

where TR_B,top corresponds to a temporal spacing
between the top field of the current image and the
field of the past image which is referenced by MV_top,
TR_D,top corresponds to a temporal spacing between the
top field of the future image and the field of the
past image which is referenced by MV_top, and MV_D is a
delta motion vector.

3. The method of claim 2, wherein: 

MV_f,top is determined using integer division with
truncation toward zero; and

MV_top and MV_bot are integer half-luma pel motion
vectors.

4. The method of claim 2 or 3, wherein: 
TR_B,top and TR_D,top incorporate a temporal
correction which accounts for whether said current
field coded image is top field first or bottom field
first.

5. The method of one of the preceding claims,
wherein:

MV_f,bot, the forward motion vector for predicting
the bottom field of the current image, is determined
according to the expression
(MV_bot * TR_B,bot) / TR_D,bot + MV_D;

where TR_B,bot corresponds to a temporal spacing
between the bottom field of the current image and
the field of the past image which is referenced by
MV_bot, TR_D,bot corresponds to a temporal spacing
between the bottom field of the future image and the
field of the past image which is referenced by MV_bot,
and MV_D is a delta motion vector.
6. The method of claim 5, wherein:

MV_f,bot is determined using integer division with
truncation toward zero; and

MV_top and MV_bot are integer half-luma pel motion
vectors.

7. The method of claim 5 or 6, wherein: 
TR_B,bot and TR_D,bot incorporate a temporal
correction which accounts for whether said current
field coded image is top field first or bottom field
first.

8. The method of one of the preceding claims, 
wherein:

MV_b,top, the backward motion vector for
predicting the top field of the current image, is
determined according to one of the equations
(a) MV_b,top = ((TR_B,top - TR_D,top) * MV_top) / TR_D,top and
(b) MV_b,top = MV_f,top - MV_top;

where TR_B,top corresponds to a temporal spacing
between the top field of the current image and the
field of the past image which is referenced by MV_top,
TR_D,top corresponds to a temporal spacing between the
top field of the future image and the field of the
past image which is referenced by MV_top, and MV_f,top is
the forward motion vector for predicting the top
field of the current image.

9. The method of claim 8, wherein: 
said equation (a) is selected when a delta
motion vector MV_D = 0, and said equation (b) is
selected when MV_D ≠ 0.

10. The method of one of the preceding claims, 
wherein: 

MV_b,bot, the backward motion vector for
predicting the bottom field of the current image, is
determined according to one of the equations
(a) MV_b,bot = ((TR_B,bot - TR_D,bot) * MV_bot) / TR_D,bot and
(b) MV_b,bot = MV_f,bot - MV_bot;

where TR_B,bot corresponds to a temporal spacing
between the bottom field of the current image and
the field of the past image which is referenced by
MV_bot, TR_D,bot corresponds to a temporal spacing
between the bottom field of the future image and the
field of the past image which is referenced by MV_bot,
and MV_f,bot is the forward motion vector for
predicting the bottom field of the current image.

11. The method of claim 10, wherein:
said equation (a) is selected when a delta
motion vector MV_D = 0, and said equation (b) is
selected when MV_D ≠ 0.
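
The scaling recited in claims 2 to 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, each call handles one component (horizontal or vertical) of one field's vector in half-pel units, and applying truncation toward zero to equation (a) as well is an assumption, since the claims state it only for the forward vector.

```python
def div_trunc(numerator: int, denominator: int) -> int:
    """Integer division truncating toward zero (Python's // floors instead)."""
    quotient = abs(numerator) // abs(denominator)
    return quotient if (numerator >= 0) == (denominator > 0) else -quotient


def direct_mode_field_mvs(mv: int, tr_b: int, tr_d: int, mv_d: int = 0):
    """Return (forward, backward) direct-mode vectors for one field.

    mv   : MV_top or MV_bot of the future reference field (half-pel units)
    tr_b : TR_B for that field (current field to referenced past field)
    tr_d : TR_D for that field (future field to referenced past field)
    mv_d : delta motion vector MV_D
    """
    if tr_d == 0:
        raise ValueError("TR_D must be non-zero")
    # Forward vector: (MV * TR_B) / TR_D + MV_D
    forward = div_trunc(mv * tr_b, tr_d) + mv_d
    if mv_d == 0:
        # Equation (a): ((TR_B - TR_D) * MV) / TR_D
        backward = div_trunc((tr_b - tr_d) * mv, tr_d)
    else:
        # Equation (b): MV_f - MV
        backward = forward - mv
    return forward, backward
```

For example, with MV_top = 6, TR_B,top = 1, TR_D,top = 3 and MV_D = 0, the sketch yields a forward component of 2 and a backward component of -4. The same function would be called once per field and per vector component.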

12. A method for selecting a coding mode for a
current predicted, field coded macroblock having top
and bottom fields, in a sequence of digital video
images, comprising the steps of:

determining a forward sum of absolute
differences error, SAD_forward,field, for the current
macroblock relative to a past reference macroblock,
which corresponds to a forward coding mode;

determining a backward sum of absolute
differences error, SAD_backward,field, for the current
macroblock relative to a future reference
macroblock, which corresponds to a backward coding
mode;

determining an average sum of absolute
differences error, SAD_average,field, for the current
macroblock relative to an average of said past and
future reference macroblocks, which corresponds to
an average coding mode; and

selecting said coding mode according to the
minimum of said SADs.

13. The method of claim 12, comprising the
further step of:

selecting said coding mode according to the
minimum of respective sums of said SADs with
corresponding bias terms which account for the
number of required motion vectors of the respective
coding modes.

14. The method of claim 12 or 13, wherein:
SAD_forward,field is determined according to a sum
of: (a) a sum of absolute differences for the top
field of the current macroblock relative to a top
field of the past reference macroblock, and (b) a
sum of absolute differences for the bottom field of
the current macroblock relative to a bottom field of
the past reference macroblock.

15. The method of one of claims 12 to 14,
wherein:

SAD_backward,field is determined according to a sum
of: (a) a sum of absolute differences for the top
field of the current macroblock relative to a top
field of the future reference macroblock, and (b) a 
sum of absolute differences for the bottom field of 
the current macroblock relative to a bottom field of 
the future reference macroblock. 

16. The method of one of claims 12 to 15,
wherein:

SAD_average,field is determined according to a sum
of: (a) a sum of absolute differences for the top
field of the current macroblock relative to an
average of the top fields of the past and future
reference macroblocks, and (b) a sum of absolute
differences for the bottom field of the current
macroblock relative to an average of the bottom
fields of the past and future reference macroblocks.
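
The mode selection of claims 12 to 16 can be sketched as below. This is an illustrative sketch under assumptions rather than the claimed implementation: each field is represented as a flat list of pixel values already motion compensated, the averaging rounds half up, and the function names and the `bias` dictionary are hypothetical. The claims say only that bias terms account for the number of motion vectors each mode requires; they do not give values.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def average_block(past, future):
    """Pixel-wise average of two fields (rounding half up is an assumption)."""
    return [(p + f + 1) // 2 for p, f in zip(past, future)]


def select_field_mode(cur_top, cur_bot, past_top, past_bot,
                      fut_top, fut_bot, bias=None):
    """Return (mode, costs) where mode minimises SAD plus its bias term."""
    bias = bias or {}
    costs = {
        # SAD_forward,field: top and bottom fields against the past reference
        "forward": sad(cur_top, past_top) + sad(cur_bot, past_bot),
        # SAD_backward,field: top and bottom fields against the future reference
        "backward": sad(cur_top, fut_top) + sad(cur_bot, fut_bot),
        # SAD_average,field: top and bottom fields against the averaged references
        "average": sad(cur_top, average_block(past_top, fut_top))
                   + sad(cur_bot, average_block(past_bot, fut_bot)),
    }
    biased = {mode: cost + bias.get(mode, 0) for mode, cost in costs.items()}
    return min(biased, key=biased.get), costs
```

With a current macroblock lying exactly midway between the two references, the average mode has zero error and is chosen; a sufficiently large bias on the average mode, which needs both forward and backward vectors, would tip the choice toward a single-direction mode.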
17. A decoder for recovering a current, direct
mode, field coded macroblock having top and bottom
fields in a sequence of digital video macroblocks
from a received bit stream, wherein said current
macroblock is bi-directionally predicted using a
past field coded reference macroblock having top and
bottom fields, and a future field coded reference
macroblock having top and bottom fields, comprising:

means for recovering MV_top, a forward motion
vector of the top field of the future macroblock
which references one of the top and bottom fields of
the past macroblock, and MV_bot, a forward motion
vector of the bottom field of the future macroblock
which references one of the top and bottom fields of
said past macroblock; and

means for determining forward and backward
motion vectors for predicting at least one of the
top and bottom fields of the current macroblock by
scaling the forward motion vector of the
corresponding field of the future macroblock.

18. The decoder of claim 17, further
comprising:

means for determining MV_{f,top}, the forward motion
vector for predicting the top field of the current
macroblock, according to the expression

(MV_{top} * TR_{B,top}) / TR_{D,top} + MV_D;

where TR_{B,top} corresponds to a temporal spacing
between the top field of the current macroblock and
the field of the past macroblock which is referenced
by MV_{top}, TR_{D,top} corresponds to a temporal spacing
between the top field of the future macroblock and
the field of the past macroblock which is referenced
by MV_{top}, and MV_D is a delta motion vector.

19. The decoder of claim 18, wherein: 

MV_{f,top} is determined using integer division with
truncation toward zero; and

MV_{top} and MV_{bot} are integer half-luma pel motion
vectors.

20. The decoder of claim 18 or 19, wherein:

TR_{B,top} and TR_{D,top} incorporate a temporal
correction which accounts for whether said current
field coded image is top field first or bottom field
first.

21. The decoder of one of claims 17 to 20,
further comprising:

means for determining MV_{f,bot}, the forward motion
vector for predicting the bottom field of the
current macroblock, according to the expression

(MV_{bot} * TR_{B,bot}) / TR_{D,bot} + MV_D;

where TR_{B,bot} corresponds to a temporal spacing
between the bottom field of the current macroblock
and the field of the past macroblock which is
referenced by MV_{bot}, TR_{D,bot} corresponds to a temporal
spacing between the bottom field of the future
macroblock and the field of the past macroblock
which is referenced by MV_{bot}, and MV_D is a delta
motion vector.

22. The decoder of claim 21, wherein: 

MV_{f,bot} is determined using integer division with
truncation toward zero; and

MV_{top} and MV_{bot} are integer half-luma pel motion
vectors.

23. The decoder of claim 21 or 22, wherein:

TR_{B,bot} and TR_{D,bot} incorporate a temporal
correction which accounts for whether said current
field coded image is top field first or bottom field
first.

24. The decoder of one of claims 17 to 23,
further comprising:

means for determining MV_{b,top}, the backward
motion vector for predicting the top field of the
current macroblock, according to one of the
equations

(a) MV_{b,top} = ((TR_{B,top} - TR_{D,top}) * MV_{top}) / TR_{D,top} and
(b) MV_{b,top} = MV_{f,top} - MV_{top};

where TR_{B,top} corresponds to a temporal spacing
between the top field of the current macroblock and
the field of the past macroblock which is referenced
by MV_{top}, TR_{D,top} corresponds to a temporal spacing
between the top field of the future macroblock and
the field of the past macroblock which is referenced
by MV_{top}, and MV_{f,top} is the forward motion vector for
predicting the top field of the current macroblock.

25. The decoder of claim 24, further
comprising:

means for selecting said equation (a) when a
delta motion vector MV_D = 0; and

means for selecting said equation (b) when
MV_D ≠ 0.

26. The decoder of one of claims 17 to 24,
further comprising:

means for determining MV_{b,bot}, the backward
motion vector for predicting the bottom field of the
current macroblock, according to one of the
equations

(a) MV_{b,bot} = ((TR_{B,bot} - TR_{D,bot}) * MV_{bot}) / TR_{D,bot} and
(b) MV_{b,bot} = MV_{f,bot} - MV_{bot};

where TR_{B,bot} corresponds to a temporal spacing
between the bottom field of the current macroblock
and the field of the past macroblock which is
referenced by MV_{bot}, TR_{D,bot} corresponds to a temporal
spacing between the bottom field of the future
macroblock and the field of the past macroblock
which is referenced by MV_{bot}, and MV_{f,bot} is the
forward motion vector for predicting the bottom
field of the current macroblock.

27. The decoder of claim 26, further
comprising:

means for selecting said equation (a) when a
delta motion vector MV_D = 0; and

means for selecting said equation (b) when
MV_D ≠ 0.

3. Detailed Description of Invention

BACKGROUND OF THE INVENTION



The present invention provides a method and
apparatus for coding of digital video images such as
bi-directionally predicted video object planes
(B-VOPs), in particular, where the B-VOP and/or a
reference image used to code the B-VOP is interlaced
coded.

The invention is particularly suitable for use
with various multimedia applications, and is
compatible with the MPEG-4 Verification Model (VM)
8.0 standard (MPEG-4 VM 8.0) described in document
ISO/IEC/JTC1/SC29/WG11 N1796, entitled "MPEG-4 Video
Verification Model Version 8.0", Stockholm, July
1997, incorporated herein by reference. The MPEG-2
standard is a precursor to the MPEG-4 standard, and
is described in document ISO/IEC 13818-2, entitled
"Information Technology - Generic Coding of Moving
Pictures and Associated Audio, Recommendation
H.262," March 25, 1994, incorporated herein by
reference.

MPEG-4 is a coding standard which provides a
flexible framework and an open set of coding tools
for communication, access, and manipulation of
digital audio-visual data. These tools support a
wide range of features. The flexible framework of
MPEG-4 supports various combinations of coding tools
and their corresponding functionalities for
applications required by the computer,
telecommunication, and entertainment (i.e., TV and
film) industries, such as database browsing,
information retrieval, and interactive
communications.

MPEG-4 provides standardized core technologies
allowing efficient storage, transmission and
manipulation of video data in multimedia
environments. MPEG-4 achieves efficient
compression, object scalability, spatial and
temporal scalability, and error resilience.

The MPEG-4 video VM coder/decoder (codec) is a
block- and object-based hybrid coder with motion
compensation. Texture is encoded with an 8x8
Discrete Cosine Transformation (DCT) utilizing
overlapped block-motion compensation. Object shapes
are represented as alpha maps and encoded using a
Content-based Arithmetic Encoding (CAE) algorithm or
a modified DCT coder, both using temporal
prediction. The coder can handle sprites as they
are known from computer graphics. Other coding
methods, such as wavelet and sprite coding, may also
be used for special applications.

Motion compensated texture coding is a well
known approach for video coding, and can be modeled
as a three-stage process. The first stage is signal
processing which includes motion estimation and
compensation (ME/MC) and a two-dimensional (2-D)
spatial transformation. The objective of ME/MC and
the spatial transformation is to take advantage of
temporal and spatial correlations in a video
sequence to optimize the rate-distortion performance
of quantization and entropy coding under a
complexity constraint. The most common technique
for ME/MC has been block matching, and the most
common spatial transformation has been the DCT.

However, special concerns arise for ME/MC of
macroblocks (MBs) in B-VOPs when the MB is itself
interlaced coded and/or uses reference images which
are interlaced coded.

In particular, it would be desirable to have an
efficient technique for providing motion vector (MV)
predictors for a MB in a B-VOP. It would also be
desirable to have an efficient technique for direct
mode coding of a field coded MB in a B-VOP. It
would further be desirable to have a coding mode
decision process for a MB in a field coded B-VOP for
selecting the reference image which results in
the most efficient coding.

The present invention provides a system having
the above and other advantages.

SUMMARY OF THE INVENTION

In accordance with the present invention, a
method and apparatus are presented for coding of
digital video images such as a current image (e.g.,
macroblock) in a bi-directionally predicted video
object plane (B-VOP), in particular, where the
current image and/or a reference image used to code
the current image is interlaced (e.g., field) coded.

In a first aspect of the invention, a method
provides direct mode motion vectors (MVs) for a
current bi-directionally predicted, field coded image
such as a macroblock (MB) having top and bottom
fields, in a sequence of digital video images. A
past field coded reference image having top and
bottom fields, and a future field coded reference
image having top and bottom fields are determined.
The future image is predicted using the past image
such that MV_{top}, a forward MV of the top field of the
future image, references either the top or bottom
field of said past image. The field which is
referenced contains a best-match MB for a MB in the
top field of the future image.

This MV is termed a "forward" MV since, although
it references a past image (e.g., backward in time),
the prediction is from the past image to the future
image, e.g., forward in time. As a mnemonic, the
prediction direction may be thought of as being
opposite the direction of the corresponding MV.
Similarly, MV_{bot}, a forward motion vector of the
bottom field of the future image, references either
the top or bottom field of the past image. Forward
and backward MVs are determined for predicting the
top and/or bottom fields of the current image by
scaling the forward MV of the corresponding field of
the future image.

In particular, MV_{f,top}, the forward motion vector
for predicting the top field of the current image, is
determined according to the expression

MV_{f,top} = (MV_{top} * TR_{B,top}) / TR_{D,top} + MV_D,

where MV_D is a delta motion vector for a search area,
TR_{B,top} corresponds to a temporal spacing between the
top field of the current image and the field of the
past image which is referenced by MV_{top}, and TR_{D,top}
corresponds to a temporal spacing between the top
field of the future image and the field of the past
image which is referenced by MV_{top}. The temporal
spacing may be related to a frame rate at which the
images are displayed.

Similarly, MV_{f,bot}, the forward motion vector for
predicting the bottom field of the current image, is
determined according to the expression

MV_{f,bot} = (MV_{bot} * TR_{B,bot}) / TR_{D,bot} + MV_D,

where MV_D is a delta motion vector, TR_{B,bot}
corresponds to a temporal spacing between the bottom
field of the current image and the field of the past
image which is referenced by MV_{bot}, and TR_{D,bot}
corresponds to a temporal spacing between the bottom
field of the future MB and the field of the past MB
which is referenced by MV_{bot}.

MV_{b,top}, the backward motion vector for
predicting the top field of the current MB, is
determined according to the equation

MV_{b,top} = ((TR_{B,top} - TR_{D,top}) * MV_{top}) / TR_{D,top}

when the delta motion vector MV_D = 0, or
MV_{b,top} = MV_{f,top} - MV_{top} when MV_D ≠ 0.

MV_{b,bot}, the backward motion vector for
predicting the bottom field of the current MB, is
determined according to the equation

MV_{b,bot} = ((TR_{B,bot} - TR_{D,bot}) * MV_{bot}) / TR_{D,bot}

when the delta motion vector MV_D = 0, or
MV_{b,bot} = MV_{f,bot} - MV_{bot} when MV_D ≠ 0.

A corresponding decoder is also presented.
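
The direct mode scaling summarized above can be sketched in a few lines of Python. This is an illustrative sketch only, not the reference implementation: the function names are hypothetical, motion vectors are assumed to be (x, y) tuples in integer half-pel units, and truncation toward zero is assumed for the backward vector as well as for the forward vector.

```python
def trunc_div(num: int, den: int) -> int:
    """Integer division truncating toward zero (Python's // floors instead)."""
    if den == 0:
        raise ValueError("temporal spacing TR_D must be non-zero")
    q = abs(num) // abs(den)
    return q if (num >= 0) == (den > 0) else -q


def direct_mode_field_mvs(mv, tr_b, tr_d, mv_d=(0, 0)):
    """Return (forward, backward) MVs for one field of the current MB.

    mv   -- MV_top or MV_bot of the future MB, (x, y) in half-pel units
    tr_b -- TR_B for this field (current field to referenced past field)
    tr_d -- TR_D for this field (future field to referenced past field)
    mv_d -- delta motion vector MV_D
    """
    # MV_f = (MV * TR_B) / TR_D + MV_D, applied per component.
    forward = tuple(trunc_div(c * tr_b, tr_d) + d for c, d in zip(mv, mv_d))
    if mv_d == (0, 0):
        # Equation (a): MV_b = ((TR_B - TR_D) * MV) / TR_D
        backward = tuple(trunc_div((tr_b - tr_d) * c, tr_d) for c in mv)
    else:
        # Equation (b): MV_b = MV_f - MV
        backward = tuple(f - c for f, c in zip(forward, mv))
    return forward, backward
```

For example, with MV_{top} = (6, -3), TR_{B,top} = 1 and TR_{D,top} = 3, the sketch gives a forward vector of (2, -1) and a backward vector of (-4, 2).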

In another aspect of the invention, a method is
presented for selecting a coding mode for a current
predicted, field coded MB having top and bottom
fields, in a sequence of digital video MBs. The
coding mode may be a backward mode, where the
reference MB is temporally after the current MB in
display order, a forward mode, where the reference MB
is before the current MB, or average (e.g.,
bi-directional) mode, where an average of prior and
subsequent reference MBs is used.

The method includes the step of determining a
forward sum of absolute differences error,
SAD_{forward,field}, for the current MB relative to a past
reference MB, which corresponds to a forward coding
mode. SAD_{forward,field} indicates the error in pixel
luminance values between the current MB and a best
match MB in the past reference MB. A backward sum of
absolute differences error, SAD_{backward,field}, for the
current MB relative to a future reference MB, which
corresponds to a backward coding mode, is also
determined. SAD_{backward,field} indicates the error in
pixel luminance values between the current MB and a
best match MB in the future reference MB.

An average sum of absolute differences error,
SAD_{average,field}, for the current MB relative to an
average of the past and future reference MBs, which
corresponds to an average coding mode, is also
determined. SAD_{average,field} indicates the error in
pixel luminance values between the current MB and a
MB which is the average of the best match MBs of the
past and future reference MBs.

The coding mode is selected according to the
minimum of the SADs. Bias terms which account for
the number of required MVs of the respective coding
modes may also be factored into the coding mode
selection process.

SAD_{forward,field}, SAD_{backward,field} and SAD_{average,field}
are determined by summing the component terms over
the top and bottom fields.

DETAILED DESCRIPTION OF THE INVENTION

A method and apparatus are presented for coding
of a digital video image such as a macroblock (MB)
in a bi-directionally predicted video object plane
(B-VOP), in particular, where the MB and/or a
reference image used to code the MB is interlaced
coded. The scheme provides a method for selecting a
prediction motion vector (PMV) for the top and
bottom field of a field coded current MB, including
forward and backward PMVs as required, as well as
for frame coded MBs. A direct coding mode for a
field coded MB is also presented, in addition to a
coding decision process which uses the minimum of
sum of absolute differences terms to select an
optimum mode.
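
The coding decision process for a field coded MB (minimum of the forward, backward and average field SADs, each summed over the top and bottom fields, optionally with bias terms) might be sketched as follows. The function names, the dictionary of bias terms and the list-of-rows block representation are assumptions made for illustration; the text does not give values for the bias terms.

```python
def sad(block, reference):
    """Sum of absolute luminance differences between two equal-size blocks."""
    return sum(abs(p - q)
               for row, ref_row in zip(block, reference)
               for p, q in zip(row, ref_row))


def field_sad(cur_top, cur_bot, ref_top, ref_bot):
    """SAD for a field coded MB: top-field term plus bottom-field term."""
    return sad(cur_top, ref_top) + sad(cur_bot, ref_bot)


def select_coding_mode(sad_forward, sad_backward, sad_average, bias=None):
    """Pick the mode with the smallest (optionally biased) field SAD."""
    bias = bias or {}  # hypothetical per-mode bias terms, default 0
    costs = {
        "forward": sad_forward + bias.get("forward", 0),
        "backward": sad_backward + bias.get("backward", 0),
        "average": sad_average + bias.get("average", 0),
    }
    return min(costs, key=costs.get)
```

With field SADs of 220 (forward), 200 (backward) and 195 (average) and no bias, the sketch selects the average mode; a bias of 10 on the average mode shifts the choice to the backward mode.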

FIG. 1 is an illustration of a video object
plane (VOP) coding and decoding process in
accordance with the present invention. Frame 105
includes three pictorial elements, including a
square foreground element 107, an oblong foreground
element 108, and a landscape backdrop element 109.
In frame 115, the elements are designated VOPs using
a segmentation mask such that VOP 117 represents the
square foreground element 107, VOP 118 represents
the oblong foreground element 108, and VOP 119
represents the landscape backdrop element 109. A
VOP can have an arbitrary shape, and a succession of
VOPs is known as a video object. A full rectangular
video frame may also be considered to be a VOP.
Thus, the term "VOP" will be used herein to indicate
both arbitrary and non-arbitrary (e.g., rectangular)
image area shapes. A segmentation mask is obtained
using known techniques, and has a format similar to
that of ITU-R 601 luminance data. Each pixel is
identified as belonging to a certain region in the
video frame.

The frame 105 and VOP data from frame 115 are
supplied to separate encoding functions. In
particular, VOPs 117, 118 and 119 undergo shape,
motion and texture encoding at encoders 137, 138 and
139, respectively. With shape coding, binary and
gray scale shape information is encoded. With
motion coding, the shape information is coded using
motion estimation within a frame. With texture
coding, a spatial transformation such as the DCT is
performed to obtain transform coefficients which can
be variable-length coded for compression.

The coded VOP data is then combined at a
multiplexer (MUX) 140 for transmission over a
channel 145. Alternatively, the data may be stored
on a recording medium. The received coded VOP data
is separated by a demultiplexer (DEMUX) 150 so that
the separate VOPs 117-119 are decoded and recovered.
Frames 155, 165 and 175 show that VOPs 117, 118 and
119, respectively, have been decoded and recovered
and can therefore be individually manipulated using
a compositor 160 which interfaces with a video
library 170, for example.

The compositor may be a device such as a
personal computer which is located at a user's home
to allow the user to edit the received data to
provide a customized image. For example, the user's
personal video library 170 may include a previously
stored VOP 178 (e.g., a circle) which is different
than the received VOPs. The user may compose a
frame 185 where the circular VOP 178 replaces the
square VOP 117. The frame 185 thus includes the
received VOPs 118 and 119 and the locally stored VOP
178.

In another example, the background VOP 109 may
be replaced by a background of the user's choosing.
For example, when viewing a television news
broadcast, the announcer may be coded as a VOP which
is separate from the background, such as a news
studio. The user may select a background from the
library 170 or from another television program, such
as a channel with stock price or weather
information. The user can therefore act as a video
editor.

The video library 170 may also store VOPs which
are received via the channel 145, and may access
VOPs and other image elements via a network such as
the Internet. Generally, a video session comprises
a single VOP, or a sequence of VOPs.

The video object coding and decoding process of
FIG. 1 enables many entertainment, business and
educational applications, including personal
computer games, virtual environments, graphical user
interfaces, videoconferencing, Internet applications
and the like. In particular, the capability for
ME/MC with interlaced coded (e.g., field mode) VOPs
in accordance with the present invention provides
even greater capabilities.

FIG. 2 is a block diagram of an encoder in
accordance with the present invention. The encoder
is suitable for use with both predictive-coded VOPs
(P-VOPs) and bi-directionally coded VOPs (B-VOPs).

P-VOPs may include a number of macroblocks
(MBs) which may be coded individually using an
intra-frame mode or an inter-frame mode. With
intra-frame (INTRA) coding, the macroblock (MB) is
coded without reference to another MB. With
inter-frame (INTER) coding, the MB is differentially coded
with respect to a temporally subsequent frame in a
mode known as forward prediction. The temporally
subsequent frame is known as an anchor frame or
reference frame. The anchor frame (e.g., VOP) must
be a P-VOP or an I-VOP, not a B-VOP. An I-VOP
includes self-contained (e.g., intra-coded) blocks
which are not predictive coded.

With forward prediction, the current MB is
compared to a search area of MBs in the anchor frame
to determine the best match. A corresponding motion
vector (MV), known as a backward MV, describes the
displacement of the current MB relative to the best
match MB. Additionally, an advanced prediction mode
for P-VOPs may be used, where motion compensation is
performed on 8x8 blocks rather than 16x16 MBs.
Moreover, both intra-frame and inter-frame coded
P-VOP MBs can be coded in a frame mode or a field
mode.

B-VOPs can use the forward prediction mode as
described above in connection with P-VOPs as well as
backward prediction, bi-directional prediction, and
direct mode, which are all inter-frame techniques.
B-VOPs do not currently use intra-frame coded MBs
under MPEG-4 VM 8.0, although this is subject to
change. The anchor frame (e.g., VOP) must be a
P-VOP or I-VOP, not a B-VOP.

With backward prediction of B-VOPs, the current
MB is compared to a search area of MBs in a
temporally previous anchor frame to determine the
best match. A corresponding MV, known as a forward
MV, describes the relative displacement of the
current MB relative to the best match MB. With
bi-directional prediction of a B-VOP MB, the current MB
is compared to a search area of MBs in both a
temporally previous anchor frame and a temporally
subsequent anchor frame to determine the best match
MBs. Forward and backward MVs describe the
displacement of the current MB relative to the best
match MBs. Additionally, an averaged image is
obtained from the best match MBs for use in encoding
the current MB.

With direct mode prediction of B-VOPs, a MV is
derived for an 8x8 block when the collocated MB in
the following P-VOP uses the 8x8 advanced prediction
mode. The MV of the 8x8 block in the P-VOP is
linearly scaled to derive a MV for the block in the
B-VOP without the need for searching to find a best
match block.

The encoder, shown generally at 200, includes a
shape coder 210, a motion estimation function 220, a
motion compensation function 230, and a texture
coder 240, which each receive video pixel data input
at terminal 205. The motion estimation function
220, motion compensation function 230, texture coder
240, and shape coder 210 also receive VOP shape
information input at terminal 207, such as the
MPEG-4 parameter VOP_of_arbitrary_shape. When this
parameter is zero, the VOP has a rectangular shape,
and the shape coder 210 therefore is not used.

A reconstructed anchor VOP function 250
provides a reconstructed anchor VOP for use by the
motion estimation function 220 and motion
compensation function 230. A current VOP is
subtracted from a motion compensated previous VOP at
subtractor 260 to provide a residue which is encoded
at the texture coder 240. The texture coder 240
performs the DCT to provide texture information
(e.g., transform coefficients) to a multiplexer
(MUX) 280. The texture coder 240 also provides
information which is summed with the output from the
motion compensator 230 at a summer 270 for input to
the previous reconstructed VOP function 250.

Motion information (e.g., motion vectors) is
provided from the motion estimation function 220 to
the MUX 280, while shape information which indicates
the shape of the VOP is provided from the shape
coding function 210 to the MUX 280. The MUX 280
provides a corresponding multiplexed data stream to
a buffer 290 for subsequent communication over a
data channel.

The pixel data which is input to the encoder
may have a YUV 4:2:0 format. The VOP is represented
by means of a bounding rectangle. The top left
coordinate of the bounding rectangle is rounded to
the nearest even number not greater than the top
left coordinates of the tightest rectangle.
Accordingly, the top left coordinate of the bounding
rectangle in the chrominance component is one-half
that of the luminance component.
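
The rounding of the bounding rectangle origin can be illustrated with a short Python sketch. The function name and the (x, y) tuple convention are hypothetical; the sketch only shows that clearing the lowest bit yields the nearest even integer not greater than the input, so that the chrominance origin is an exact half of the luminance origin.

```python
def bounding_origin(tight_x: int, tight_y: int):
    """Return (luma_origin, chroma_origin) for a VOP bounding rectangle.

    tight_x, tight_y -- top left coordinate of the tightest rectangle
    """
    # Clear bit 0: nearest even number not greater than the input.
    luma = (tight_x & ~1, tight_y & ~1)
    # 4:2:0 chrominance is subsampled by two in each direction.
    chroma = (luma[0] // 2, luma[1] // 2)
    return luma, chroma
```

For a tightest rectangle starting at (7, 8), the sketch returns a luminance origin of (6, 8) and a chrominance origin of (3, 4).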

FIG. 3 illustrates an interpolation scheme for
a half-pixel search. Motion estimation and motion
compensation (ME/MC) generally involve matching a
block of a current video frame (e.g., a current
block) with a block in a search area of a reference
frame (e.g., a predicted block or reference block).
For predictive (P) coded images, the reference block
is in a previous frame. For bi-directionally
predicted (B) coded images, predicted blocks in
previous and subsequent frames may be used. The
displacement of the predicted block relative to the
current block is the motion vector (MV), which has
horizontal (x) and vertical (y) components.
Positive values of the MV components indicate that
the predicted block is to the right of, and below,
the current block.

A motion compensated difference block is formed
by subtracting the pixel values of the predicted
block from those of the current block point by
point. Texture coding is then performed on the
difference block. The coded MV and the coded
texture information of the difference block are
transmitted to the decoder. The decoder can then
reconstruct an approximated current block by adding
the quantized difference block to the predicted
block according to the MV. The block for ME/MC can
be a 16x16 frame block (macroblock), an 8x8 block or
a 16x8 field block.

Accuracy of the MV is set at half-pixel.
Interpolation must be used on the anchor frame so
that p(i+x, j+y) is defined for x or y being half of an
integer. Interpolation is performed as shown in
FIG. 3. Integer pixel positions are represented by
the symbol "+", as shown at A, B, C and D.
Half-pixel positions are indicated by circles, as shown
at a, b, c and d. As seen, a = A, b = (A + B)//2,
c = (A + C)//2, and d = (A + B + C + D)//4, where
"//" denotes rounded division. Further details of
the interpolation are discussed in MPEG-4 VM 8.0
referred to previously as well as commonly assigned
U.S. Patent application Serial No. 08/897,847 to
Eifrig et al., filed July 21, 1997, entitled "Motion
Estimation and Compensation of Video Object Planes
for Interlaced Digital Video", incorporated herein
by reference.
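
As a rough illustration of the half-pixel interpolation, the following Python sketch computes the four samples a, b, c and d from the integer pixels A, B, C and D. It assumes non-negative pixel values and that the rounded division rounds halves upward; the function names are hypothetical, and note that "//" in the text denotes rounded division whereas Python's // operator is floor division.

```python
def rounded_div(total: int, n: int) -> int:
    """Rounded division for non-negative totals; halves round up (assumed)."""
    return (2 * total + n) // (2 * n)


def half_pel_samples(A: int, B: int, C: int, D: int):
    """Return (a, b, c, d) for the 2x2 neighbourhood A B / C D."""
    a = A                                # integer position
    b = rounded_div(A + B, 2)            # horizontal half-pixel
    c = rounded_div(A + C, 2)            # vertical half-pixel
    d = rounded_div(A + B + C + D, 4)    # diagonal half-pixel
    return a, b, c, d
```

For A = 10, B = 13, C = 20 and D = 21, the sketch yields (10, 12, 15, 16).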

FIG. o illustrates reordering of pixel lines in
an adaptive frame/field prediction scheme in
accordance with the present invention. In a first
aspect of the advanced prediction technique, an
adaptive technique is used to decide whether a
current macroblock (MB) of 16x16 pixels should be
ME/MC coded as is, or divided into four blocks of
8x8 pixels each, where each 8x8 block is ME/MC coded
separately, or whether field based motion estimation
should be used, where pixel lines of the MB are
reordered to group the same-field lines in two 16x8
field blocks, and each 16x8 block is separately
ME/MC coded.

A field mode 16x16 macroblock (MB) is shown
generally at 600. The MB includes even-numbered
lines 602, 604, 606, 608, 610, 612, 614 and 616, and
odd-numbered lines 603, 605, 607, 609, 611, 613, 615
and 617. The even and odd lines are thus
interleaved, and form top and bottom (or first and
second) fields, respectively.

When the pixel lines in image 600 are permuted
to form same-field luminance blocks, the MB shown
generally at 650 is formed. Arrows, shown generally
at 645, indicate the reordering of the lines
602-617. For example, the even line 602, which is the
first line of MB 600, is also the first line of MB
650. The even line 604 is reordered as the second
line in MB 650. Similarly, the even lines 606, 608,
610, 612, 614 and 616 are reordered as the third
through eighth lines, respectively, of MB 650.
Thus, a 16x8 luminance region 680 with even-numbered
lines is formed. Similarly, the odd-numbered lines
603, 605, 607, 609, 611, 613, 615 and 617 form a
16x8 region 685.
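
The line permutation described above amounts to de-interleaving the rows of the MB. A minimal Python sketch, assuming the MB is held as a list of 16 rows (the function names are illustrative, not taken from the text):

```python
def split_fields(mb):
    """Split an interleaved MB (list of rows) into (top, bottom) field blocks."""
    if len(mb) % 2:
        raise ValueError("macroblock must have an even number of lines")
    top = mb[0::2]      # even-numbered lines (e.g., 602, 604, ..., 616)
    bottom = mb[1::2]   # odd-numbered lines (e.g., 603, 605, ..., 617)
    return top, bottom


def reorder_to_field_blocks(mb):
    """Return the permuted MB: top-field lines first, then bottom-field lines."""
    top, bottom = split_fields(mb)
    return top + bottom
```

Applied to a 16-line MB whose rows are labelled 0 through 15, the reordered MB holds rows 0, 2, ..., 14 followed by rows 1, 3, ..., 15, i.e. two 16x8 same-field regions.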

The decision process for choosing the MC mode
for P-VOPs is as follows. For frame mode video,
first obtain the Sum of Absolute Differences (SAD)
for a single 16x16 MB, e.g., SAD_{16}(MV_x, MV_y); and for
four 8x8 blocks, e.g.,

SAD_8(MV_{x1}, MV_{y1}), SAD_8(MV_{x2}, MV_{y2}), SAD_8(MV_{x3}, MV_{y3}), and SAD_8(MV_{x4}, MV_{y4}).

If \sum_{i=1}^{4} SAD_8(MV_{xi}, MV_{yi}) < SAD_{16}(MV_x, MV_y) - 129, choose 8x8
prediction; otherwise, choose 16x16 prediction. The
constant "129" is obtained from Nb/2+1, where Nb is
the number of non-transparent pixels in a MB.
For interlaced video, obtain
SAD_{top}(MV_{x,top}, MV_{y,top}) and SAD_{bottom}(MV_{x,bottom}, MV_{y,bottom}),
where (MV_{x,top}, MV_{y,top}) and (MV_{x,bottom}, MV_{y,bottom})
are the motion vectors (MV) for
both top (even) and bottom (odd) fields. Then,
choose the reference field which has the smallest
SAD (e.g., for SAD_{top} and SAD_{bottom}) from the field
half sample search.

The overall prediction mode decision is based
on choosing the minimum of:

(a) SAD_{16}(MV_x, MV_y),
(b) \sum_{i=1}^{4} SAD_8(MV_{xi}, MV_{yi}) + 129, and
(c) SAD_{top}(MV_{x,top}, MV_{y,top}) + SAD_{bottom}(MV_{x,bottom}, MV_{y,bottom}) + 65.

If term (a) is the minimum, 16x16 prediction is
used. If term (b) is the minimum, 8x8 motion
compensation (advanced prediction mode) is used. If
term (c) is the minimum, field based motion
estimation is used. The constant "65" is obtained
from Nb/4+1.
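
The three-way decision can be sketched in Python as below. The function name, the mode labels and the default of Nb = 256 (a fully opaque 16x16 MB) are assumptions for illustration; the SAD values are taken as already computed by the respective searches.

```python
def choose_prediction_mode(sad16, sad8, sad_top, sad_bottom, nb=256):
    """Choose among 16x16, 8x8 and field prediction for a P-VOP MB.

    sad16      -- SAD of the single 16x16 MB
    sad8       -- iterable with the four 8x8 block SADs
    sad_top    -- SAD of the top (even) field block
    sad_bottom -- SAD of the bottom (odd) field block
    nb         -- number of non-transparent pixels in the MB
    """
    costs = {
        "16x16": sad16,                                # term (a)
        "8x8": sum(sad8) + nb // 2 + 1,                # term (b), +129 if nb=256
        "field": sad_top + sad_bottom + nb // 4 + 1,   # term (c), +65 if nb=256
    }
    return min(costs, key=costs.get)
```

For instance, with SAD_{16} = 1000, four 8x8 SADs of 200 each, and field SADs of 500 and 480, the terms are 1000, 929 and 1045, so 8x8 prediction is chosen.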

If 8x8 prediction is chosen, there are four MVs
for the four 8x8 luminance blocks, i.e., one MV for
each 8x8 block. The MV for the two chrominance
blocks is then obtained by taking an average of
these four MVs and dividing the average value by
two. Since each MV for the 8x8 luminance block has
a half-pixel accuracy, the MV for the chrominance
blocks may have a sixteenth pixel value. Table 1,
below, specifies the conversion of a sixteenth pixel
value to a half-pixel value for chrominance MVs.
For example, 0 through 2/16 are rounded to 0, 3/16
through 13/16 are rounded to 1/2, and 14/16 and
15/16 are rounded to 2/2=1.



Table 1

1/16 pixel value   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
1/2 pixel value    0   0   0   1   1   1   1   1   1   1   1   1   1   1   2   2
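
The use of Table 1 can be illustrated with the following Python sketch for one MV component. Because the four luminance components are in half-pel units, their sum is the chrominance displacement in sixteenth-pel units (average of four, divided by two). The function name is hypothetical, and applying the table to the magnitude and then restoring the sign is an assumption about how negative values are treated.

```python
# Table 1: index is the 1/16 pixel fraction, value is in 1/2 pixel units.
SIXTEENTH_TO_HALF = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2]


def chroma_mv_component(luma_components):
    """Derive one chrominance MV component (half-pel units) from the four
    8x8 luminance MV components (half-pel units)."""
    total = sum(luma_components)          # displacement in 1/16 pel units
    sign = -1 if total < 0 else 1
    whole, frac = divmod(abs(total), 16)  # whole pixels, 1/16 pel remainder
    return sign * (2 * whole + SIXTEENTH_TO_HALF[frac])
```

Four luminance components of 3 half-pels each sum to 12/16 pixel, which Table 1 maps to one half-pixel; components of 7, 8, 8 and 8 sum to 31/16, which maps to 4 half-pels (two full pixels).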



With field prediction, there are two MVs for
the two 16x8 blocks. The luminance prediction is
generated as follows. The even lines of the MB
(e.g., lines 602, 604, 606, 608, 610, 612, 614 and
616) are defined by the top field MV using the
reference field specified. The MV is specified in
frame coordinates such that full pixel vertical
displacements correspond to even integral values of
the vertical MV coordinate, and a half-pixel
vertical displacement is denoted by odd integral
values. When a half-pixel vertical offset is
specified, only pixels from lines within the same
reference field are combined.

The MV for the two chrominance blocks is
derived from the (luminance) MV by dividing each
component by two, then rounding. The horizontal
component is rounded by mapping all fractional
values into a half-pixel offset. The vertical MV
component is an integer and the resulting
chrominance MV vertical component is rounded to an
integer. If the result of dividing by two yields a
non-integral value, it is rounded to the adjacent
odd integer. Note that the odd integral values
denote vertical interpolation between lines of the
same field.

The second aspect of the advanced prediction
technique is overlapped MC for luminance blocks,
discussed in greater detail in MPEG-4 VM 8.0 and
the Eifrig et al. application referred to previously.

Specific coding techniques for B-VOPs are now
discussed. For INTER coded VOPs such as B-VOPs,
there are four prediction modes, namely, direct
mode, interpolate (e.g., averaged or bi-directional)
mode, backward mode, and forward mode. The latter
three modes are non-direct modes. Forward only, or
backward only prediction are also known as
"unidirectional" prediction. The predicted blocks
of the B-VOP are determined differently for each
mode. Furthermore, blocks of a B-VOP and the anchor
block(s) may be progressive (e.g., frame) coded or
interlaced (e.g., field) coded.

A single B-VOP can have different MBs which are
predicted with different modes. The term "B-VOP"
only indicates that bi-directionally predicted
blocks may be included, but this is not required.
In contrast, with P-VOPs and I-VOPs,
bi-directionally predicted MBs are not used.

For non-direct mode B-VOP MBs, MVs are coded differentially. For
forward MVs in forward and bi-directional modes, and backward MVs in
backward and bi-directional modes, the "same-type" MV (e.g., forward
or backward) of the MB which immediately precedes the current MB in
the same row is used as a predictor. This is the same as the
immediately preceding MB in raster order, and generally, in
transmission order. However, if the raster order differs from the
transmission order, the MVs of the immediately preceding MB in
transmission order should be used to avoid the need to store and
re-order the MBs and corresponding MVs at the decoder.

Using the same-type MV, and assuming the transmission order is the
same as the raster order, and that the raster order is from left to
right, top to bottom, the forward MV of the left-neighboring MB is
used as a predictor for the forward MV of the current MB of the
B-VOP. Similarly, the backward MV of the left-neighboring MB is used
as a predictor for the backward MV of the current MB of the B-VOP.
The MVs of the current MB are then differentially encoded using the
predictors. That is, the difference between the predictor and the MV
which is determined for the current MB is transmitted as a motion
vector difference to a decoder. At the decoder, the MV of the current
MB is determined by recovering and adding the PMV and the difference
MV. In case the current MB is located on the left edge of the VOP,
the predictor for the current MB is set to zero.
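
As a minimal sketch of this differential scheme, the following Python code encodes and decodes one row of MBs. It assumes, for simplicity, that every MB in the row carries a single MV of the same type (e.g., all forward, frame mode); the function names are hypothetical and not part of any syntax described here.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) components


def encode_row(mvs: List[MV]) -> List[MV]:
    """Return the motion vector differences for one row of MBs."""
    predictor: MV = (0, 0)  # the predictor is zero at the left edge
    diffs = []
    for mv in mvs:
        diffs.append((mv[0] - predictor[0], mv[1] - predictor[1]))
        predictor = mv      # the left neighbor predicts the next MB
    return diffs


def decode_row(diffs: List[MV]) -> List[MV]:
    """Recover the MVs by adding each difference to its predictor."""
    predictor: MV = (0, 0)
    mvs = []
    for dx, dy in diffs:
        mv = (predictor[0] + dx, predictor[1] + dy)
        mvs.append(mv)
        predictor = mv
    return mvs
```

Neighboring MBs often move similarly, so the differences tend to be small, which is what makes the differential representation cheaper to transmit.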

For interlaced-coded B-VOPs, each of the top and bottom fields has
two associated prediction motion vectors, for a total of four MVs.
The four prediction MVs represent, in transmission order, the top
field forward and bottom field forward of the previous anchor MB, and
the top field backward and bottom field backward of the next anchor
MB. The current MB and the forward MB, and/or the current MB and the
backward MB, may be separated by one or more intermediate images
which are not used for ME/MC coding of the current MB. B-VOPs do not
contain INTRA coded MBs, so each MB in the B-VOP will be ME/MC coded.
The forward and backward anchor MBs may be from a P-VOP or I-VOP, and
may be frame or field coded.

For interlaced, non-direct mode B-VOP MBs, four possible prediction
motion vectors (PMVs) are shown in Table 2 below. The first column of
Table 2 shows the prediction function, while the second column shows
a designator for the PMV. These PMVs are used as shown in Table 3
below for the different MB prediction modes.

Table 2

  Prediction function       PMV type
  Top field, forward        0
  Bottom field, forward     1
  Top field, backward       2
  Bottom field, backward    3



Table 3

  Macroblock mode           PMV type used
  Frame, forward            0
  Frame, backward           2
  Frame, bi-directional     0,2
  Field, forward            0,1
  Field, backward           2,3
  Field, bi-directional     0,1,2,3



For example, Table 3 shows that, for a current field mode MB with a
forward prediction mode (e.g., "Field, forward"), top field forward
("0") and bottom field forward ("1") motion vector predictors are
used.

After being used in differential coding, the motion vectors of a
current MB become the PMVs for a subsequent MB, in transmission
order. The PMVs are reset to zero at the beginning of each row of MBs
since the MVs of a MB at the end of a preceding row are unlikely to
be similar to the MVs of a MB at the beginning of a current row. The
predictors are also not used for direct mode MBs. For skipped MBs,
the PMVs retain the last value.

With direct mode coding of B-VOP MBs, no vector differences are
transmitted. Instead, the forward and backward MVs are directly
computed at the decoder from the MVs of the temporally next P-VOP MB,
with correction by a single delta MV, which is not predicted. The
technique is efficient since less MV data is transmitted.
Table 4 below summarizes which PMVs are used to code the motion
vectors of the current B-VOP MB based on the previous and current MB
types. For B-VOPs, an array of prediction motion vectors, pmv[ ], may
be provided, indexed from zero to three (e.g., pmv[0], pmv[1], pmv[2]
and pmv[3]). The pmv[ ] indexes are not transmitted, but the decoder
can determine the pmv[ ] index to use according to the MV coding type
and the particular vector being decoded. After coding a B-VOP MB,
some of the PMVs are updated to be the same as the motion vectors of
the current MB. The first one, two or four PMVs are updated depending
on the number of MVs associated with the current MB.

For example, a forward, field predicted MB has two motion vectors,
where pmv[0] is the PMV for the top field, forward, and pmv[1] is the
PMV for the bottom field, forward. For a backward, field predicted
MB, pmv[2] is the PMV for the top field, backward, and pmv[3] is the
PMV for the bottom field, backward. For a bi-directional, field
predicted MB, pmv[0] is the PMV for the top field, forward, pmv[1] is
the PMV for the bottom field, forward, pmv[2] is the PMV for the top
field, backward, and pmv[3] is the PMV for the bottom field,
backward. For a forward or backward predicted frame mode B-VOP MB,
there is only one MV, so only pmv[0] is used for forward, and pmv[2]
is used for backward. For an average (e.g., bi-directionally)
predicted frame mode B-VOP MB, there are two MVs, namely, pmv[0] for
the forward MV, and pmv[2] for the backward MV. The row designated
"pmv[ ]'s to update" indicates whether one, two or four MVs are
updated.

It will be appreciated that Table 4 is merely a shorthand notation
for implementing the technique of the present invention for selecting
a prediction MV for a current MB. However, the scheme may be
expressed in various other ways.
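
One such alternative expression is sketched below in Python. It is an illustration only, under stated assumptions: the names `PMV_TYPES` and `PmvState` are hypothetical, the lookup follows Table 3, and each predictor that is used is simply overwritten by the MV it helped to decode. The exact update rule of Table 4 is not reproduced here, and direct mode and skipped MBs are modeled by simply not calling `decode_mb`.

```python
from typing import List, Tuple

MV = Tuple[int, int]

# PMV indexes used per (MB structure, prediction direction), after Table 3.
PMV_TYPES = {
    ("frame", "forward"): (0,),
    ("frame", "backward"): (2,),
    ("frame", "bidirectional"): (0, 2),
    ("field", "forward"): (0, 1),
    ("field", "backward"): (2, 3),
    ("field", "bidirectional"): (0, 1, 2, 3),
}


class PmvState:
    """Holds pmv[0..3] while the MBs of a B-VOP are decoded."""

    def __init__(self) -> None:
        self.pmv: List[MV] = [(0, 0)] * 4

    def start_row(self) -> None:
        """Reset all predictors to zero at the beginning of a row of MBs."""
        self.pmv = [(0, 0)] * 4

    def decode_mb(self, structure: str, direction: str,
                  mvds: List[MV]) -> List[MV]:
        """Add each MV difference to its predictor and update the predictor."""
        indexes = PMV_TYPES[(structure, direction)]
        if len(mvds) != len(indexes):
            raise ValueError("number of MV differences does not match mode")
        mvs = []
        for index, (dx, dy) in zip(indexes, mvds):
            px, py = self.pmv[index]
            mv = (px + dx, py + dy)
            self.pmv[index] = mv  # assumed update: used predictor := decoded MV
            mvs.append(mv)
        return mvs
```

Because the index is implied by the MB mode and by the position of the vector within the MB, nothing beyond the MV differences themselves has to be transmitted.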

Intra block DC adaptive prediction can use the same algorithm as
described in MPEG-4 VM 8.0 regardless of the value of dct_type. Intra
block adaptive AC prediction is performed as described in MPEG-4 VM
8.0 except when the first row of coefficients is to be copied from
the coded block above. This operation is allowed only if dct_type has
the same value for the current MB and the block above. If the
dct_types differ, then AC prediction can occur only by copying the
first column from the block to the left. If there is no left block,
zero is used for the AC predictors.

FIG. 4 illustrates direct mode coding of the top field of an
interlaced-coded B-VOP in accordance with the present invention.
Progressive direct coding mode is used for the current macroblock
(MB) whenever the MB in a future anchor picture which is at the same
relative position (e.g., co-sited) as the current MB is coded as (1)
a 16x16 (frame) MB, (2) an intra MB or (3) an 8x8 (advanced
prediction) MB.

The direct mode prediction is interlaced whenever the co-sited
future anchor picture MB is coded as an interlaced MB. Direct mode
will be used to code the current MB if its biased SAD is the minimum
of all B-VOP MB predictors. Direct mode for an interlaced coded MB
forms the prediction MB separately for the top and bottom fields of
the current MB. The four field motion vectors (MVs) of a
bi-directional field motion compensated MB (e.g., top field forward,
bottom field forward, top field backward, and bottom field backward)
are calculated directly from the respective MVs of the corresponding
MB of the future anchor picture.

The technique is efficient since the required searching is
significantly reduced, and the amount of transmitted MV data is
reduced. Once the MVs and reference fields are determined, the
current MB is considered to be a bi-directional field predicted MB.
Only one delta motion vector (used for both fields) occurs in the
bitstream for the field predicted MB.

The prediction for the top field of the current MB is based on the
top field MV of the MB of the future anchor picture (which can be a
P-VOP, or an I-VOP with MV=0), and a past reference field of a
previous anchor picture which is selected by the corresponding MV of
the top field of the future anchor MB. That is, the top field MB of
the future anchor picture which is correspondingly positioned (e.g.,
co-sited) to the current MB has a best match MB in either the top or
bottom field of the past anchor picture. This best match MB is then
used as the anchor MB for the top field of the current MB. An
exhaustive search is used to determine the delta motion vector
$MV_D$, given the co-sited future anchor MV, on a MB by MB basis.

Motion vectors for the bottom field of the 
current MB are similarly determined using the MV of 
the correspondingly positioned bottom field of the 
future anchor MB, which in turn references a best 
match MB in the top or bottom field of the past 
anchor picture.

Essentially, the top field motion vector is used to construct an MB
predictor which is the average of (a) pixels obtained from the top
field of the correspondingly positioned future anchor MB and (b)
pixels from the past anchor field referenced by the top field MV of
the correspondingly positioned future anchor MB. Similarly, the
bottom field motion vector is used to construct an MB predictor which
is the average of (a) pixels obtained from the bottom field of the
correspondingly positioned future anchor MB and (b) pixels from the
past anchor field referenced by the bottom field MV of the
correspondingly positioned future anchor MB.

As shown in FIG. 4, the current B-VOP MB 420 includes a top field
430 and bottom field 425, the past anchor VOP MB 400 includes a top
field 410 and bottom field 405, and the future anchor VOP MB 440
includes a top field 450 and bottom field 445.

The motion vector $MV_{top}$ is the forward motion vector for the
top field 450 of the future anchor MB 440 which indicates the best
match MB in the past anchor MB 400. Even though $MV_{top}$ is
referencing a previous image (e.g., backward in time), it is a
forward MV since the future anchor VOP 440 is forward in time
relative to the past anchor VOP 400. In the example, $MV_{top}$
references the bottom field 405 of the past anchor MB 400, although
either the top 410 or bottom 405 field could be referenced.
$MV_{f,top}$ is the forward MV of the top field of the current MB,
and $MV_{b,top}$ is the backward MV of the top field of the current
MB. Pixel data is derived for the bi-directionally predicted MB at a
decoder by averaging the pixel data in the future and past anchor
images which are identified by $MV_{b,top}$ and $MV_{f,top}$,
respectively, and summing the averaged image with a residue which was
transmitted.
The motion vectors for the top field are calculated as follows:

$$MV_{f,top} = (TR_{B,top} * MV_{top}) / TR_{D,top} + MV_D;$$

$$MV_{b,top} = ((TR_{B,top} - TR_{D,top}) * MV_{top}) / TR_{D,top} \quad \text{if } MV_D = 0; \text{ and}$$

$$MV_{b,top} = (MV_{f,top} - MV_{top}) \quad \text{if } MV_D \neq 0.$$

$MV_D$ is a delta, or offset, motion vector. Note that the motion
vectors are two-dimensional. Additionally, the motion vectors are
integral half-pixel luma motion vectors. The slash "/" denotes
truncate-toward-zero integer division. Also, the future anchor VOP is
always a P-VOP for field direct mode. If the future anchor was an
I-VOP, the MV would be zero and 16x16 progressive direct mode would
be used. $TR_{B,top}$ is the temporal distance in fields between the
past reference field (e.g., top or bottom), which is the bottom field
405 in this example, and the top field 430 of the current B-VOP 420.
$TR_{D,top}$ is the temporal distance between the past reference
field (e.g., top or bottom), which is the bottom field 405 in this
example, and the future top reference field 450.
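
The top field calculation can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that the test on the delta vector is applied to each component separately, which the text does not state explicitly. Python's `//` operator rounds toward negative infinity, so a small helper is used for the truncate-toward-zero division.

```python
from typing import Tuple

MV = Tuple[int, int]


def div_trunc(a: int, b: int) -> int:
    """Integer division that truncates toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def direct_component(mv: int, delta: int, tr_b: int, tr_d: int) -> Tuple[int, int]:
    """Return (forward, backward) for one component of a field MV."""
    forward = div_trunc(tr_b * mv, tr_d) + delta
    if delta == 0:
        backward = div_trunc((tr_b - tr_d) * mv, tr_d)
    else:
        backward = forward - mv
    return forward, backward


def direct_field_mvs(mv_anchor: MV, mv_delta: MV,
                     tr_b: int, tr_d: int) -> Tuple[MV, MV]:
    """Scale the co-sited future anchor field MV into forward/backward MVs."""
    fx, bx = direct_component(mv_anchor[0], mv_delta[0], tr_b, tr_d)
    fy, by = direct_component(mv_anchor[1], mv_delta[1], tr_b, tr_d)
    return (fx, fy), (bx, by)
```

With an anchor MV of (6, -4), a zero delta, a current-field distance of 1 and an anchor distance of 3, the sketch yields a forward MV of (2, -1) and a backward MV of (-4, 2).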

FIG. 5 illustrates direct mode coding of the bottom field of an
interlaced-coded B-VOP in accordance with the present invention. Note
that the source interlaced video can have a top field first or bottom
field first format. A bottom field first format is shown in FIGs 4
and 5. Like-numbered elements are the same as in FIG. 4. Here, the
motion vector $MV_{bot}$ is the forward motion vector for the bottom
field 445 of the future anchor macroblock (MB) 440 which indicates
the best match MB in the past anchor MB 400. In the example,
$MV_{bot}$ references the bottom field 405 of the past anchor MB 400,
although either the top 410 or bottom 405 field could be used.
$MV_{f,bot}$ and $MV_{b,bot}$ are the forward and backward motion
vectors, respectively.

The motion vectors for the bottom field are calculated in a
parallel manner to the top field motion vectors, as follows:

$$MV_{f,bot} = (TR_{B,bot} * MV_{bot}) / TR_{D,bot} + MV_D;$$

$$MV_{b,bot} = ((TR_{B,bot} - TR_{D,bot}) * MV_{bot}) / TR_{D,bot} \quad \text{if } MV_D = 0; \text{ and}$$

$$MV_{b,bot} = (MV_{f,bot} - MV_{bot}) \quad \text{if } MV_D \neq 0.$$

$TR_{B,bot}$ is the temporal distance between the past reference
field (e.g., top or bottom), which is the bottom field 405 in this
example, and the bottom field 425 of the current B-VOP 420.
$TR_{D,bot}$ is the temporal distance between the past reference
field (e.g., top or bottom), which is the bottom field 405 in this
example, and the future bottom reference field 445.

Regarding the examples of FIGs 4 and 5, the calculation of
$TR_{B,top}$, $TR_{D,top}$, $TR_{B,bot}$ and $TR_{D,bot}$ depends not
only on the current field, reference field, and frame temporal
references, but also on whether the current video is top field first
or bottom field first. In particular,

$$TR_{D,top} \text{ or } TR_{D,bot} = 2 * (TR_{future} - TR_{past}) + \delta; \text{ and}$$

$$TR_{B,top} \text{ or } TR_{B,bot} = 2 * (TR_{current} - TR_{past}) + \delta$$

where $TR_{future}$, $TR_{current}$, and $TR_{past}$ are the frame
numbers of the future, current and past frames, respectively, in
display order, and $\delta$, an additive correction to the temporal
distance between fields, is given by Table 5, below. $\delta$ has
units of field periods.

For example, the designation "1" (bottom) in the last row of the
first column indicates that the future anchor field is the top field,
and the referenced field is the bottom field. This is shown in FIG.
4. The designation "1" (bottom) in the last row of the second column
indicates that the future anchor field is the bottom field, and the
referenced field is also the bottom field. This is shown in FIG. 5.



Table 5 - Temporal correction, δ

  Referenced field                  Bottom field first          Top field first
  Future anchor   Future anchor     Top field    Bottom field   Top field    Bottom field
  = top           = bottom          δ            δ              δ            δ
  top             top                0           -1              0            1
  top             bottom             0            0              0            0
  bottom          top                1           -1             -1            1
  bottom          bottom             1            0             -1            0
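
As an illustration, the temporal distance expressions and the corrections of Table 5 can be combined in a short Python sketch. The names `DELTA` and `field_distances`, and the dictionary layout, are hypothetical choices made for this example only.

```python
from typing import Dict, Tuple

# (field referenced by future anchor top field,
#  field referenced by future anchor bottom field)
#   -> {top_field_first: (delta for top field, delta for bottom field)}
DELTA: Dict[Tuple[str, str], Dict[bool, Tuple[int, int]]] = {
    ("top", "top"): {False: (0, -1), True: (0, 1)},
    ("top", "bottom"): {False: (0, 0), True: (0, 0)},
    ("bottom", "top"): {False: (1, -1), True: (-1, 1)},
    ("bottom", "bottom"): {False: (1, 0), True: (-1, 0)},
}


def field_distances(tr_past: int, tr_current: int, tr_future: int,
                    ref_top: str, ref_bottom: str,
                    top_field_first: bool) -> Dict[str, int]:
    """Return the four temporal distances, in field periods."""
    delta_top, delta_bot = DELTA[(ref_top, ref_bottom)][top_field_first]
    anchor_span = 2 * (tr_future - tr_past)    # future anchor to past anchor
    current_span = 2 * (tr_current - tr_past)  # current frame to past anchor
    return {
        "TR_D_top": anchor_span + delta_top,
        "TR_B_top": current_span + delta_top,
        "TR_D_bot": anchor_span + delta_bot,
        "TR_B_bot": current_span + delta_bot,
    }
```

For bottom-field-first video in which both future anchor fields reference the past bottom field, with frame numbers 0 (past), 1 (current) and 3 (future), the sketch gives distances of 7 and 3 field periods for the top field and 6 and 2 for the bottom field.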



For efficient coding, an appropriate coding mode decision process
is required. As indicated, for B-VOPs, a MB can be coded using (1)
direct coding, (2) 16x16 motion compensation (includes forward,
backward and averaged modes), or (3) field motion compensation
(includes forward, backward and averaged modes). Frame or field
direct coding of a current MB is used when the corresponding future
anchor MB is frame or field direct coded, respectively.

For a field motion compensated MB in a B-VOP, a decision is made to
code the MB in a forward, backward, or averaged mode based on the
minimum luminance half-pixel SADs with respect to the decoded anchor
pictures. Specifically, seven biased SAD terms are calculated as
follows:

(1) $SAD_{direct} + b_1$, (2) $SAD_{forward} + b_2$,
(3) $SAD_{backward} + b_2$, (4) $SAD_{average} + b_3$,
(5) $SAD_{forward,field} + b_3$, (6) $SAD_{backward,field} + b_3$,
and (7) $SAD_{average,field} + b_4$,

where the subscripts indicate direct mode, forward motion
prediction, backward motion prediction, average (i.e., interpolated
or bi-directional) motion prediction, frame mode (i.e., locally
progressive) and field mode (i.e., locally interlaced). The field
SADs above (i.e., $SAD_{forward,field}$, $SAD_{backward,field}$ and
$SAD_{average,field}$) are the sums of the top and bottom field SADs,
each with its own reference field and motion vector. Specifically,

$$SAD_{forward,field} = SAD_{forward,top\ field} + SAD_{forward,bottom\ field};$$

$$SAD_{backward,field} = SAD_{backward,top\ field} + SAD_{backward,bottom\ field}; \text{ and}$$

$$SAD_{average,field} = SAD_{average,top\ field} + SAD_{average,bottom\ field}.$$

$SAD_{direct}$ is the best direct mode prediction, $SAD_{forward}$
is the best 16x16 prediction from the forward (past) reference,
$SAD_{backward}$ is the best 16x16 prediction from the backward
(future) reference, $SAD_{average}$ is the best 16x16 prediction
formed by a pixel-by-pixel average of the best forward and best
backward reference, $SAD_{forward,field}$ is the best field
prediction from the forward (past) reference, $SAD_{backward,field}$
is the best field prediction from the backward (future) reference,
and $SAD_{average,field}$ is the best field prediction formed by a
pixel-by-pixel average of the best forward and best backward
reference.

The $b_i$'s are bias values as defined in Table 6, below, to
account for prediction modes which require more motion vectors.
Direct mode and modes with fewer MVs are favored.



Table 6

  Mode               Number of         b_i    Bias             Value
                     motion vectors
  Direct             1                 b_1    -(Nb/2 + 1)      -129
  Frame, forward     1                 b_2    0                0
  Frame, backward    1                 b_2    0                0
  Frame, average     2                 b_3    (Nb/4 + 1)       65
  Field, forward     2                 b_3    (Nb/4 + 1)       65
  Field, backward    2                 b_3    (Nb/4 + 1)       65
  Field, average     4                 b_4    (Nb/2 + 1)       129



The negative bias for direct mode is for consistency with the
existing MPEG-4 VM for progressive video, and may result in
relatively more skipped MBs.
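
The decision among the seven biased SAD terms can be sketched in Python as follows. This is an illustration only: the function name and the dictionary keys are hypothetical, and the default `nb = 256` is an assumption chosen because it makes Nb/4 + 1 equal to the tabulated value of 65.

```python
from typing import Dict


def select_mode(sads: Dict[str, int], nb: int = 256) -> str:
    """Return the mode whose biased SAD is the minimum.

    `sads` maps each of the seven mode names to its unbiased SAD.
    The field-mode SADs are expected to be the sums of the top and
    bottom field SADs already.
    """
    b1 = -(nb // 2 + 1)  # direct mode is favored by a negative bias
    b2 = 0
    b3 = nb // 4 + 1
    b4 = nb // 2 + 1
    bias = {
        "direct": b1,
        "forward": b2,
        "backward": b2,
        "average": b3,
        "forward_field": b3,
        "backward_field": b3,
        "average_field": b4,
    }
    # On a tie, min() keeps the first mode in the order listed above.
    return min(bias, key=lambda mode: sads[mode] + bias[mode])
```

A mode that needs more motion vectors therefore has to reduce the SAD by more than its bias before it is preferred over a cheaper mode.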

FIG. 7 is a block diagram of a decoder in accordance with the
present invention. The decoder, shown generally at 700, can be used
to receive and decode the encoded data signals transmitted from the
encoder of FIG. 2. The encoded video image data and differentially
encoded motion vector (MV) data are received at terminal 740 and
provided to a demultiplexer (DEMUX) 742. The encoded video image data
is typically differentially encoded in DCT transform coefficients as
a prediction error signal (e.g., residue).

A shape decoding function 744 processes the data when the VOP has
an arbitrary shape to recover shape information, which is, in turn,
provided to a motion compensation function 750 and a VOP
reconstruction function 752. A texture decoding function 746 performs
an inverse DCT on transform coefficients to recover residue
information. For INTRA coded macroblocks (MBs), pixel information is
recovered directly and provided to the VOP reconstruction function
752.

For INTER coded blocks and MBs, such as those in B-VOPs, the pixel
information provided from the texture decoding function 746 to the
VOP reconstruction function 752 represents a residue between the
current MB and a reference image. The reference image may be pixel
data from a single anchor MB which is indicated by a forward or
backward MV. Alternatively, for an interpolated (e.g., averaged) MB,
the reference image is an average of pixel data from two reference
MBs, e.g., one past anchor MB and one future anchor MB. In this case,
the decoder must calculate the averaged pixel data according to the
forward and backward MVs before recovering the current MB pixel data.

For INTER coded blocks and MBs, a motion decoding function 748
processes the encoded MV data to recover the differential MVs and
provide them to the motion compensation function 750 and to a motion
vector memory 749, such as a RAM. The motion compensation function
750 receives the differential MV data and determines a reference
motion vector (e.g., predictor motion vector, or PMV) in accordance
with the present invention. The PMV is determined according to the
coding mode (e.g., forward, backward, bi-directional, or direct).
Once the motion compensation function 750 determines a full
reference MV and sums it with the differential MV of the current MB,
the full MV of the current MB is available. Accordingly, the motion
compensation function 750 can now retrieve anchor frame best match
data from a VOP memory 754, such as a RAM, calculate an averaged
image if required, and provide the anchor frame pixel data to the VOP
reconstruction function to reconstruct the current MB.

The retrieved or calculated best match data is added back to the
pixel residue at the VOP reconstruction function 752 to obtain the
decoded current MB or block. The reconstructed block is output as a
video output signal and also provided to the VOP memory 754 to
provide new anchor frame data. Note that an appropriate video data
buffering capability may be required depending on the frame
transmission and presentation orders since an anchor frame for a
B-VOP MB may be a temporally future frame or field, in presentation
order.

FIG. 8 illustrates a MB packet structure in accordance with the
present invention. The structure is suitable for B-VOPs, and
indicates the format of data received by the decoder. Note that the
packets are shown in four rows for convenience only. The packets are
actually transmitted serially, starting from the top row, and from
left to right within a row. The first row 810 includes fields
first_shape_code, MVD_shape, CR, ST and BAC. A second row 830
includes fields MODB and MBTYPE. A third row 850 includes fields
CBPB, DQUANT, Interlaced_information, $MVD_f$, $MVD_b$, and MVDB. A
fourth row includes fields CODA, CBPBA, Alpha Block Data and Block
Data. Each of the above fields is defined according to MPEG-4 VM 8.0.

first_shape_code indicates whether a MB is in a bounding box of a
VOP. CR indicates a conversion ratio for Binary Alpha Blocks. ST
indicates a horizontal or vertical scan order. BAC refers to a binary
arithmetic codeword.

MODB, which indicates the mode of a MB, is present for every coded
(non-skipped) MB in a B-VOP. Difference motion vectors ($MVD_f$,
$MVD_b$, or MVDB) and CBPB are present if indicated by MODB.
Macroblock type is indicated by MBTYPE, which also signals motion
vector modes (MVDs) and quantization (DQUANT). With interlaced mode,
there can be up to four MVs per MB. MBTYPE indicates the coding type,
e.g., forward, backward, bi-directional or direct. CBPB is the Coded
Block Pattern for a B-type macroblock. CBPBA is similarly defined as
CBPB except that it has a maximum of four bits. DQUANT defines
changes in the value of a quantizer.

The field Interlaced_information in the third row 850 indicates
whether a MB is interlaced coded, and provides field MV reference
data which informs the decoder of the coding mode of the current MB
or block. The decoder uses this information in calculating the MV for
a current MB. The Interlaced_information field may be stored for
subsequent use as required in the MV memory 749 or other memory in
the decoder.

The Interlaced_information field may also include a flag dct_type
which indicates whether top and bottom field pixel lines in a field
coded MB are reordered from the interleaved order, as discussed above
in connection with FIG. 6.

The MB layer structure shown is used when VOP_prediction_type ==
10. If COD indicates skipped (COD == "1") for a MB in the most
recently decoded I- or P-VOP, then the co-located (e.g., co-sited) MB
in the B-VOP is also skipped. That is, no information is included in
the bitstream.

$MVD_f$ is the motion vector of a MB in a B-VOP with respect to a
temporally previous reference VOP (an I- or a P-VOP). It consists of
a variable length codeword for the horizontal component followed by a
variable length codeword for the vertical component. For an
interlaced B-VOP MB with field_prediction of "1" and MBTYPE of
forward or interpolate, $MVD_f$ represents a pair of field motion
vectors (top field followed by bottom field) which reference the past
anchor VOP.

$MVD_b$ is the motion vector of a MB in a B-VOP with respect to a
temporally following reference VOP (an I- or a P-VOP). It consists of
a variable length codeword for the horizontal component followed by a
variable length codeword for the vertical component. For an
interlaced B-VOP MB with field_prediction of "1" and MBTYPE of
backward or interpolate, $MVD_b$ represents a pair of field MVs (top
field followed by bottom field) which reference the future anchor
VOP.

MVDB is only present in B-VOPs if direct mode is indicated by MODB
and MBTYPE, and consists of a variable length codeword for the
horizontal component followed by a variable length codeword for the
vertical component of each vector. MVDBs represent delta vectors that
are used to correct B-VOP MB motion vectors which are obtained by
scaling P-VOP MB motion vectors.

CODA refers to gray scale shape coding. 

The arrangement shown in FIG. 8 is an example only, and various
other arrangements for communicating the relevant information to the
decoder will become apparent to those skilled in the art.

A bit stream syntax and MB layer syntax for use in accordance with
the present invention are described in MPEG-4 VM 8.0 as well as the
Eifrig et al. application referred to previously.

Accordingly, it can be seen that the present invention provides a
scheme for encoding a current MB in a B-VOP, in particular, when the
current MB is field coded, and/or an anchor MB is field coded. A
scheme for direct coding for a field coded MB is presented, in
addition to a coding decision process which uses the minimum of sum
of absolute differences terms to select an optimum mode. A prediction
motion vector (PMV) is also provided for the top and bottom field of
a field coded current MB, including forward and backward PMVs as
required, as well as for frame coded MBs.

Although the invention has been described in connection with
various specific embodiments, those skilled in the art will
appreciate that numerous adaptations and modifications may be made
thereto without departing from the spirit and scope of the invention
as set forth in the claims.

4. Brief Description of Drawings



FIG. 1 is an illustration of a video object plane (VOP) coding and
decoding process in accordance with the present invention.

FIG. 2 is a block diagram of an encoder in accordance with the
present invention.

FIG. 3 illustrates an interpolation scheme for a half-pixel search.

FIG. 4 illustrates direct mode coding of the top field of an
interlaced-coded B-VOP in accordance with the present invention.

FIG. 5 illustrates direct mode coding of the bottom field of an
interlaced-coded B-VOP in accordance with the present invention.

FIG. 6 illustrates reordering of pixel lines in an adaptive
frame/field prediction scheme in accordance with the present
invention.

FIG. 7 is a block diagram of a decoder in accordance with the
present invention.

FIG. 8 illustrates a macroblock layer structure in accordance with
the present invention.

1. Abstract



A system for coding of digital video images such as
bi-directionally predicted video object planes (B-VOPs) (420), in
particular, where the B-VOP and/or a reference image (400, 440) used
to code the B-VOP is interlaced coded. For a B-VOP macroblock (420)
which is co-sited with a field predicted macroblock of a future
anchor picture (440), direct mode prediction is made by calculating
four field motion vectors ($MV_{f,top}$, $MV_{f,bot}$, $MV_{b,top}$,
$MV_{b,bot}$), then generating the prediction macroblock. The four
field motion vectors and their reference fields are determined from
(1) an offset term ($MV_D$) of the current macroblock's coding
vector, (2) the two future anchor picture field motion vectors
($MV_{top}$, $MV_{bot}$), (3) the reference field (405, 410) used by
the two field motion vectors of the co-sited future anchor
macroblock, and (4) the temporal spacing ($TR_{B,top}$, $TR_{B,bot}$,
$TR_{D,top}$, $TR_{D,bot}$), in field periods, between the current
B-VOP fields and the anchor fields. Additionally, a coding mode
decision process for the current MB selects a forward, backward, or
average field coding mode according to a minimum sum of absolute
differences (SAD) error which is obtained over the top (430) and
bottom (425) fields of the current MB (420).



2. Representative Drawing
Figure 1